To extend the lower bound of power supply voltage, we
propose a Variable Threshold Voltage MOSFET (VTMOS)
built on Silicon-On-Insulator (SOI). The threshold voltage of
the VTMOS drops as the gate voltage is raised, resulting in a much
higher current drive than a regular MOSFET at low Vdd.
On the other hand, Vt is high at Vgs=0, thus the leakage
current is low.

The  SOI  devices  used  in  the  study  were  built  on  SIMOX 
wafers.  A  four  terminal  layout  was  used  to  provide 
separate  source,  drain,  gate,  and  body  contacts.  In  addition 
to  the  four-terminal  layout,  devices  with  local  gate-to-body 
connections  were  also  fabricated  as  illustrated  in  Fig.  1. 
This connection uses an oversized metal to P+ contact
window aligned over a "hole" in the poly gate [1]. The
metal shorts the gate and P+ region. Thus, there is no
significant penalty in area.

To operate the VTMOS, the floating body and gate of an SOI
MOSFET are tied together. This is not a new
configuration, as [1-3] have already suggested it. However,
[1-3] all tried to exploit the extra current produced by the
lateral bipolar transistor. This normally requires the body
voltage to be larger than 0.6V. Since the current gain of the
bipolar device is small, the extra drain (collector) current
comes at the cost of excessive input (base) current, which
contributes to the standby current. We will show that most
of the improvement can be achieved when gate and body
voltages are kept below 0.6V. This also ensures that the base
current will stay negligible. Although the same idea can be
used in bulk devices, the advantage is greater in SOI,
where the very small junction areas appreciably reduce the
base current and capacitances.

Fig. 2 illustrates the NMOS behavior, with a separate
terminal used to control the body voltage. The threshold
voltage at zero body bias is denoted by Vto. Body bias
effect is normally studied in the reverse bias regime, where
threshold voltage increases as body to source reverse bias is
made larger. We propose to use the exact opposite regime.
Namely, we "forward bias" the body-source junction (at
less than 0.6V), forcing the threshold voltage to drop.

Specifically, this forward bias effect is achieved by
connecting the gate to the body. This is shown as the Vgs=Vbs
line in Fig. 2. The intersect of the Vt curve and the Vgs=Vbs line,
which is marked as Vtf, is the VTMOS threshold voltage.
This lower threshold voltage does not come at the expense of
higher off-state leakage current, because at Vbs=Vgs=0 the
VTMOS and the regular device have the same Vt. In fact, they
are identical in all respects and consequently have the same
leakage. This is clearly seen in Fig. 3. The reduced Vtf
compared to Vto is attained through a theoretically ideal
subthreshold swing of 60mV/dec. Fig. 3 demonstrates this
for PMOS and NMOS devices operated in VTMOS mode
and in regular mode.
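The intersect construction described for Fig. 2 can be illustrated numerically. The sketch below is not the authors' model: it assumes the textbook long-channel body-effect expression Vt(Vbs) = Vto + gamma*(sqrt(2*phiB - Vbs) - sqrt(2*phiB)), and the values of gamma and 2*phiB are hypothetical placeholders. Only Vto=0.4V is taken from the text (tech-B). The gate-tied-to-body threshold is then the bias at which Vt(Vbs) = Vbs.

```python
import math


def vt_body_bias(vbs, vto=0.4, gamma=0.5, two_phi_b=0.85):
    """Threshold voltage (V) of an NMOS device versus body-source bias.

    Positive vbs means forward bias. Assumes the textbook body-effect
    model; gamma (V^0.5) and two_phi_b (V) are hypothetical values.
    """
    return vto + gamma * (math.sqrt(two_phi_b - vbs) - math.sqrt(two_phi_b))


def solve_vtf(vto=0.4, gamma=0.5, two_phi_b=0.85, vmax=0.6):
    """Find the bias where Vt(Vbs) = Vbs (gate tied to body) by bisection."""

    def residual(v):
        return vt_body_bias(v, vto, gamma, two_phi_b) - v

    lo, hi = 0.0, vmax
    if residual(lo) <= 0.0 or residual(hi) >= 0.0:
        raise ValueError("no sign change on [0, vmax]; adjust parameters")
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid  # Vt still above the Vgs=Vbs line, move right
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print("Vt at Vbs=0   :", vt_body_bias(0.0))
    print("Vt at Vbs=0.6 :", vt_body_bias(0.6))
    print("Vtf (Vt=Vbs)  :", solve_vtf())
```

With these placeholder parameters the threshold falls monotonically with forward bias, and the intersect lies below Vto, which is the qualitative behavior the text describes.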

This is not the only improvement. As the gate of the
VTMOS is raised above Vt, the threshold voltage drops
further. For example, for tech-B in Fig. 2, at
Vgs=Vbs=0.6V, Vt=0.18V compared to Vto=0.4V. In
operation, the upper bound for the applied Vgs=Vbs is
set by the amount of base current that can be tolerated.
This is illustrated in Fig. 3, where PMOS and NMOS
device body (base) currents are shown. At Vgs=0.6V base
currents for both PMOS and NMOS devices are less than
2nA/µm. Current drives of the VTMOS and the regular MOSFET
are compared in Fig. 4, for tech-B of Fig. 2. The VTMOS drain
current is 2.5 times that of the regular device at Vgs=0.6V, and 5.5
times that of the regular device at Vgs=0.3V.

AC performance of VTMOS is evaluated by an unloaded
101-stage CMOS ring oscillator, shown in Fig. 5. We
emphasize that since the threshold voltages of devices used
in the ring oscillator were high (tech-A), the optimum
performance was not achieved. For tech-B, ring oscillators
are not available. If devices based on tech-B are used,
the expected delay for an unloaded ring oscillator can be
calculated by:

$$\tau_{pd} = \frac{C}{4} V_{dd} \left( \frac{1}{I_{dsatn}} + \frac{1}{I_{dsatp}} \right)$$

This is shown as solid squares in Fig. 5, where C=200fF is used for
Wn=5µm and Wp=10µm. This value for C was obtained by
fitting the equation to the measured $\tau_{pd}$ of tech-A. Fig. 6
illustrates the inverter DC transfer characteristics of tech-B.
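As a rough illustration of the delay estimate, the sketch below evaluates the expression tau_pd = (C/4) * Vdd * (1/Idsatn + 1/Idsatp), which is our reading of the formula used for the tech-B prediction. C=200fF is taken from the text; the saturation currents in the example are hypothetical placeholders, not measured values.

```python
def ring_stage_delay(c_load, vdd, idsat_n, idsat_p):
    """Per-stage delay (s) of an unloaded CMOS ring oscillator.

    c_load  : fitted stage capacitance in farads
    vdd     : supply voltage in volts
    idsat_n : NMOS saturation current in amperes
    idsat_p : PMOS saturation current in amperes
    """
    if idsat_n <= 0.0 or idsat_p <= 0.0:
        raise ValueError("saturation currents must be positive")
    return (c_load / 4.0) * vdd * (1.0 / idsat_n + 1.0 / idsat_p)


if __name__ == "__main__":
    c_fit = 200e-15          # 200 fF, from the text
    for vdd in (0.3, 0.45, 0.6):
        # Hypothetical drive currents chosen only to exercise the formula.
        tau = ring_stage_delay(c_fit, vdd, idsat_n=100e-6, idsat_p=100e-6)
        print(f"Vdd={vdd:.2f} V -> tau_pd={tau * 1e12:.1f} ps")
```

In practice the saturation currents depend strongly on Vdd, so each supply point would use the currents measured at that bias rather than a fixed value as in this example.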
Fig. 1 a) Cross section of an SOI NMOSFET with body and
gate tied together, b) Gate to body connection by using
aluminum to short the gate and P+ region.


Body Bias, Vbs (V)


Fig. 2 Threshold Voltage of SOI NMOSFET as a function of
body-source forward bias. For Tech-A Tox=10nm,
Na=2.0x10^17 cm^-3. For Tech-B Tox=6.4nm, Na=2.3x10^17 cm^-3.


Fig. 4 Drain current of an SOI NMOSFET operated as a
VTMOS and as a regular device.


Fig. 5 Delay of a 101-stage ring oscillator. The PMOS and
NMOS devices in the ring are VTMOS with Tox=10nm, and
Leff=0.3µm, Vt=0.6V. Solid squares show the predicted
delay for a ring oscillator based on Tech-B with Leff=0.3µm.


Fig.  3  Subthreshold  characteristics  of  SOI  NMOSFET  and 
PMOSFET,  with  body  grounded  and  body  tied  to  the  gate. 


Fig. 6 Inverter DC transfer characteristics. PMOS and
NMOS devices forming the inverter are VTMOS.
Abstract: PMOS transistors with effective channel lengths
down to 0.15 µm have been fabricated on silicon-on-insulator
(SOI) films. Gate oxide thicknesses of 5.5 and 10 nm are used.
These p+-gate PMOS devices exhibit excellent short-channel
behavior, low source-drain resistance, and remarkably large
current drive and transconductance. For Tox = 5.5 nm, saturation
transconductances of 274 mS/mm at 300 K and 352
mS/mm at 80 K are achieved, which are the highest reported
values for this oxide thickness. The result is attributed to low
series resistance, forward-bias body effect, and the reduction of
body charge effect.


I.  Introduction 

POTENTIAL advantages of MOS transistors built in
thin SOI films include less process complexity, reduced
parasitic capacitances, improved short channel effects,
absence of latch-up, and higher transconductance
and current drive. However, to date very few successful
experimental results have been reported to substantiate
improved current drive. Often high parasitic series resistance
has obscured this advantage [1]. Here, for the first
time, we report experimental results for deep-submicrometer
SOI PMOSFETs with improved performance
over their bulk counterparts.

II.  Device  Fabrication 

A full description of the process integration is given in
[2]. Here, we provide only the key processing steps. SIMOX
substrates with a final SOI film thickness of 130 nm were
used. The 130-nm film thickness permits low device series
resistance without using silicidation. Also, by avoiding
ultrathin films, the desired threshold voltage can be easily
achieved. Mesas were created by plasma etching a
nitride/oxide/silicon stack stopping at the buried oxide. Next a
100-nm oxide was grown on the mesa sidewalls to prevent
low-Vt edge devices and gate oxide defects at the mesa
corners. Threshold implants were then performed, resulting
in concentrations of 1-3 × 10^17 cm^-3. Gate oxides of
Fig. 1. I-V characteristics of a PMOSFET with W/Leff = 9.5 µm/0.2 µm
and Tox = 5.5 nm.

5.5 and 10 nm thickness were grown, followed by the
deposition of 280 nm of undoped polysilicon. Doping of
the poly gate was realized by a 30-keV 5 × 10^15 cm^-2
boron implant. The combination of p+-poly gate, silicon
film thickness, and doping concentration resulted in nearly
fully depleted (NFD) devices with threshold voltage range
of -0.3 to -0.5 V. These threshold voltages are consistent
with intended low-voltage operation of the devices.
Effective channel lengths as short as 0.08 µm were obtained
by O2 ashing of the gate photoresist [3].

III. Device Performance

We note that all reported channel lengths here are the
effective channel lengths, determined from standard conductivity
measurement, not the mask lengths. Fig. 1 shows
the I-V characteristics of a 9.5-µm/0.2-µm device with
Tox = 5.5 nm. Although fabricated devices are NFD, and
long-channel devices show a kink in their I-V, very short
devices have reduced kinks as seen in Fig. 1. This is due to
the fact that the depletion regions of the source/drain
junctions nearly deplete the film at drain voltages lower
than the onset of the kink. Fig. 2 shows the subthreshold
swing and threshold voltage shift (ΔVt) of the fabricated
PMOSFETs. Although ultrathin film is not used, good
subthreshold swing and short-channel behavior are observed.
Subthreshold characteristics of a device with
Leff = 0.2 µm are shown in Fig. 3.

Fig. 4 shows that for Tox = 5.5 nm, the device with Leff
of 0.15 µm has saturation transconductances (Gm) of 274
mS/mm at room temperature and 352 mS/mm at 80 K.
These are measured values and have not been corrected
Effective Channel Length (µm)


Fig. 2. Threshold voltage shift (ΔVt) and subthreshold swing (S) versus
effective channel length at Vds = -0.1 V. ΔVt is the difference between
Vt of a long-channel device and Vt of the given device.


Fig. 3. Subthreshold characteristics of a device with W/Leff = 9.5
µm/0.2 µm.

for series resistance effect. In fact a key to the achieved
transconductance is the relatively low series resistance,
which ranges from 700 to 1100 Ω·µm for our devices.
The Gm of our devices with Tox = 10 nm is higher than
those reported for bulk devices with Tox = 6 nm and
Tox = 8 nm [4], [5]. Fig. 5 similarly shows that the Idsat of
present devices is larger than recently reported values for
both bulk and SOI devices with thinner gate oxides [1], [5].
There are several reasons for SOI devices built on thin
silicon films to have higher Gm and Idsat than their bulk
counterparts. Fully depleted (FD) and nearly fully depleted
(NFD) devices have reduced or no bulk charge
effect that raises the local Vt increasingly toward the
drain. Reduced bulk charge also reduces the local effective
vertical field and improves the carrier mobility [7]. An
additional effect not reported before is the effect of
forward bias on the floating body even before the onset of
the kink. To demonstrate this effect, special four-terminal
devices with body contacts were fabricated on the same
die as regular three-terminal devices. Fig. 6 shows the I-V


Fig. 4. Measured saturation transconductance (Gm) versus effective
channel length, measured at -2 V.


Effective Channel Length (µm)


Fig. 5. Normalized saturation current versus effective channel length.


Fig. 6. I-V characteristics of a four-terminal device that has a body
contact. Solid lines show I-V when the body is floating. Circles are used for
the grounded body, and solid triangles are used when -0.3 V is applied
to the body contact. Each step is -0.3 V.

of a four-terminal device. Three sets of curves are shown
with body floating, body grounded, and -0.3 V (forward
bias) applied to the body. The drain current is clearly larger with the
fourth terminal open than grounded. However, the floating-body
case and the forward-bias case match before
onset of the kink, indicating that a forward bias of about
0.3 V is present when the body is floating. Since the
body-drain junction is reverse biased and some leakage
current flows from drain to body, the forward bias of the
body-source junction allows this current to flow from
body to source. One drawback of the forward bias is
reduction of threshold voltage and increase of leakage
current at Vgs = 0, as seen in Fig. 3.

IV. Conclusion

Using SIMOX wafers with silicon film thickness of 130
nm, PMOS transistors with effective channel lengths down
to 0.15 µm are fabricated. These devices exhibit excellent
short-channel behavior, low series resistance, and remarkable
Gm and Idsat. For Tox = 5.5 nm, Gm of 274 mS/mm
at 300 K and 352 mS/mm at 80 K are achieved, which are
the highest reported values for this oxide thickness. The
high performance is attributed to low series resistance,
reduction of body charge effect, and the forward-bias
body effect.
Observation of Velocity Overshoot in
Silicon Inversion Layers

Fariborz Assaderaghi, Ping Keung Ko, Member, IEEE, and Chenming Hu, Fellow, IEEE


Abstract: Employing a novel test structure, electron velocity
overshoot in silicon inversion layers is observed at room temperature.
For channel lengths longer than 0.3 µm, the velocity/field
relation follows the well-known behavior with no channel
length dependence. The first indication of velocity overshoot is
seen at channel length of 0.22 µm, while at L = 0.12 µm drift
velocities up to 35% larger than the long channel value are
measured.


I. Introduction

AS MOS transistor dimensions shrink to the deep-submicrometer
regime, nonlocal effects are expected to
become more prominent. Perhaps the most important of
these nonlocal effects is velocity overshoot, which can
improve current drive and transconductance. Several authors
have provided theoretical models (see [1], [2] and
references therein). Recently, measurement of very high
transconductance in 0.1-µm MOSFET's was attributed to
velocity overshoot [3]. This attribution was made by comparing
the measured transconductance with Monte Carlo
simulation of the reported device [4]. Here we report
observation of velocity overshoot, using a special test
structure.

II. Device Structure

As in our previous work of measuring saturation velocity
[5], we employ back-channel conduction in silicon-on-insulator
(SOI) MOSFET's. The SOI devices used in the
study are built on SIMOX wafers. A full description of
process integration is given in [6]. Here we provide only
the key device parameters. As shown in Fig. 1, front-gate
oxide thickness Tfg, silicon film thickness Tsi, and buried
oxide thickness Tbg are 18, 130, and 400 nm, respectively.
The doping concentration is approximately 6-8 × 10^16 cm^-3.
In the normal mode of operation of these devices,
the inversion layer is formed at the front Si/SiO2 interface.
However, it is possible to form the inversion layer at
the buried oxide/silicon interface by applying a very large
back-gate voltage Vbg. To eliminate conduction by the
front channel, negative front-gate voltage is applied to
accumulate the front Si/SiO2 interface. This unusual
structure and bias condition provide a unique opportunity
for observing velocity overshoot as follows. For short-channel
devices (e.g., 0.5 µm), only a small drain voltage
(e.g., 1 V) is required to achieve a high tangential field.
Since the drain voltage Vd is much smaller than the back-gate
voltage Vbg (e.g., 70 V), the inversion charge density is
essentially uniform in the channel between source and
drain. Thus the tangential field is uniform. This is to be
contrasted with a regular thin-oxide MOSFET where the
tangential field is nonuniform and increases significantly
from source to drain.

The idea of utilizing very thick gate oxides to obtain
uniform inversion layers was first tried in bulk MOSFETs
by Fang and Fowler [7]. They used this technique to
measure electron saturation velocity in inversion layers
with good accuracy. However, if one employs a submicrometer
bulk MOSFET with very thick gate oxide, the
device will suffer from punchthrough. In the SOI MOSFET
punchthrough is effectively suppressed due to the
presence of the thin silicon film.

III. Results and Discussion

NMOSFETs with W = 9.5 µm and channel lengths
from 0.6 to 0.12 µm were used. Front-gate voltage was set
to -4 V to accumulate the front interface, while the
back-gate threshold voltage was about 11 V. Since Vbg
was in the range of 60-100 V and Vd was kept below 1.5
V, the inversion charge was essentially uniform, allowing
us to write I = Cbg W (Vbg - Vt) v, where Cbg is the buried
oxide capacitance, Vt is the back gate threshold voltage, W
is the channel width, and v is the electron drift velocity.
Since v is the only unknown in this relation, it can be
determined from the measured current. The tangential
field is given by Ey = (Vd - I Rsd)/L, where Rsd (series
resistance) is 50-60 Ω for our devices.
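The extraction procedure can be sketched in a few lines. The code below is only an illustration of the two relations, I = Cbg W (Vbg - Vt) v and Ey = (Vd - I Rsd)/L, in CGS-style units (cm, F/cm^2, cm/s). The function names are hypothetical, Cbg is assumed to be the parallel-plate capacitance per unit area of the 400-nm buried oxide, and the example current is a made-up number, not a measurement.

```python
EPS_OX = 3.9 * 8.854e-14  # permittivity of SiO2 in F/cm


def buried_oxide_cap(t_box_cm):
    """Buried-oxide capacitance per unit area (F/cm^2), parallel-plate model."""
    return EPS_OX / t_box_cm


def drift_velocity(i_d, w_cm, v_bg, v_t, t_box_cm=400e-7):
    """Solve I = Cbg * W * (Vbg - Vt) * v for the drift velocity v (cm/s)."""
    c_bg = buried_oxide_cap(t_box_cm)
    return i_d / (c_bg * w_cm * (v_bg - v_t))


def tangential_field(v_d, i_d, r_sd, l_cm):
    """Tangential field (V/cm) after removing the series-resistance drop."""
    return (v_d - i_d * r_sd) / l_cm


if __name__ == "__main__":
    # Hypothetical operating point for a W = 9.5 um, L = 0.47 um device.
    i_meas = 3.0e-3  # A, illustrative only
    v = drift_velocity(i_meas, w_cm=9.5e-4, v_bg=70.0, v_t=11.0)
    e_y = tangential_field(v_d=1.0, i_d=i_meas, r_sd=55.0, l_cm=0.47e-4)
    print(f"v  = {v:.3e} cm/s")
    print(f"Ey = {e_y:.3e} V/cm")
```

Because the inversion charge is set almost entirely by Vbg - Vt, the only bias-dependent correction in this sketch is the I*Rsd drop subtracted from the drain voltage.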

Fig. 2 shows electron drift velocity versus tangential
field for a 0.47-µm device. Very good agreement with
Thornber's equation [8] is achieved for the usual choice of
β = 2:

$$v(E_y) = \frac{v_0}{\left[ 1 + \left( \frac{v_0}{\mu E_y} \right)^{\beta} \right]^{1/\beta}}$$

where µ is the low-field mobility and v_0 is the saturation velocity.
As seen in this figure, the low field mobility at Vbg = 50 V
is 480 cm^2/V·s and decreases to 390 cm^2/V·s at Vbg =
90 V, as expected. Not surprisingly, velocity tends to
saturate at tangential fields above 3 × 10^4 V/cm, and it
does not show strong dependence on the vertical field.
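For readers who want to reproduce the shape of the long-channel curve, the sketch below evaluates Thornber's velocity/field expression in the algebraically equivalent form v = µE / [1 + (µE/v_0)^β]^(1/β), which avoids a division by zero at E = 0. The mobility values are those quoted in the text; the saturation velocity v_0 used here is a hypothetical placeholder, since the text does not state the fitted value.

```python
def thornber_velocity(e_field, mu, v_sat, beta=2.0):
    """Drift velocity (cm/s) from Thornber's empirical velocity/field relation.

    e_field : tangential field in V/cm
    mu      : low-field mobility in cm^2/(V*s)
    v_sat   : saturation velocity in cm/s
    beta    : fitting exponent (2 is the usual choice for electrons)
    """
    x = mu * e_field / v_sat
    return mu * e_field / (1.0 + x ** beta) ** (1.0 / beta)


if __name__ == "__main__":
    v_sat = 8.0e6  # cm/s, hypothetical placeholder value
    for mu in (480.0, 390.0):  # low-field mobilities quoted in the text
        for e in (1e3, 1e4, 3e4, 1e5):
            v = thornber_velocity(e, mu, v_sat)
            print(f"mu={mu:5.0f}  E={e:8.1e} V/cm  v={v:.3e} cm/s")
```

At low fields the expression reduces to v ≈ µE, and at high fields it approaches v_0, so a velocity above v_0 in a short device cannot be produced by this relation.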
Fig. 1. Schematic cross section of an SOI MOSFET. By applying a
large positive voltage to the back gate, the inversion layer is formed at
the back Si/SiO2 interface.


Fig. 2. Measured electron drift velocity versus tangential field for a
device with L = 0.47 µm. Vertical field is used as a parameter.

Fig. 3 shows the results of similar measurements on
devices with different channel lengths. v(Ey) for channel
lengths in the range of 0.6-0.35 µm nearly overlap, but
for shorter channel lengths the high field velocity starts to
increase with decreasing channel length. Clearly for L <
0.25 µm, the drift velocity exceeds the saturation velocity
of long-channel structures. For example, at L = 0.12 µm
drift velocities up to 35% larger than the saturation
velocity are observed. It should be noted that the concept
of uniform charge, electric field, and drift velocity that we
used to derive the velocity/field relationship is only valid
for long-channel devices (i.e., L > 0.32 µm). For very
short devices (where overshoot is not negligible), the
velocity is not constant in the channel, and the values
reported here should be treated as "average" drift velocities.

Fig. 4 shows the average drift velocity as a function of
channel length, with the tangential field as a parameter.
For L > 0.25 µm (e.g., L = 0.32 µm), as tangential field


Fig. 3. Measured electron drift velocity versus applied tangential field.
Device channel length is used as a parameter. Vbg - Vt = 60 V.


Fig. 4. Average electron drift velocity versus channel length, with
tangential field used as a parameter.

increases the drift velocity increases but tends to saturate
for larger fields. This is to be contrasted with the L =
0.12-µm device, which shows no clear velocity saturation
even at Ey = 8 × 10^4 V/cm. Moreover, for relatively
moderate fields (e.g., 1 × 10^4 V/cm), the measured velocities
for all different channel lengths are about the same
and no significant overshoot is observed. This is obviously
not the case for larger fields.

One complicating factor in the above measurements is
that for very short channel lengths, the threshold voltage
becomes dependent on the drain voltage. Fig. 5 shows the
Id-Vbg characteristics of the 0.12-µm device, which represents
the worst case of Vt reduction. We took into account
the Vd dependence of threshold voltage by measuring
current shifts at different drain voltages.

IV.  Conclusion 

Novel SOI structures are utilized to study the phenomenon
of velocity overshoot. Velocity overshoot is observed
at room temperature for channel lengths as long as
0.22 µm. At 0.12 µm, drift velocities up to 35% larger
than the saturation velocity are measured.
Fig. 5. Drain current of the L = 0.12-µm device plotted as a function
of back-gate voltage. Drain voltage is used as a parameter.
ROOM TEMPERATURE OBSERVATION OF VELOCITY OVERSHOOT

IN SILICON INVERSION LAYERS

Fariborz Assaderaghi, Ping Keung Ko, and Chenming Hu
Department of Electrical Engineering and Computer Sciences
U. C. Berkeley, Berkeley CA 94720

As MOS transistor dimensions shrink to the deep sub-micron regime, the non-local effects are
expected to become more prominent. Perhaps the most important of these non-local effects is velocity
overshoot, which can be beneficial to device performance by improving current drive and transconductance.
Here, for the first time, we report direct observation of velocity overshoot using a special test
structure. The first indication of velocity overshoot is seen at channel length of 0.22µm, while at
Leff=0.12µm drift velocity values up to 40% higher than the long channel value are measured.

As in our previous work of measuring saturation velocity [1], we employ back channel conduction
in silicon-on-insulator (SOI) MOSFET's. The SOI devices used in the study are built on SIMOX wafers
[2]. As shown in Fig.1, the front gate oxide thickness (Tfg), silicon film thickness (Tsi), and buried oxide
thickness (Tbg) are 18nm, 130nm, and 400nm, respectively. The film doping concentration is approximately
6-8x10^16 cm^-3. The inversion layer is formed at the buried oxide/silicon interface by applying a
very large back gate voltage Vbg. To eliminate conduction by the front channel, negative front gate voltage
is applied to accumulate the front Si/SiO2 interface. This unusual bias condition provides a unique opportunity
for observing velocity overshoot as follows.

NMOSFET's with channel lengths from 0.6µm to 0.12µm were used. Since back-gate voltage Vbg
was in the range of 60-100V and Vd was kept below 1.5V, the inversion charge was essentially uniform
between source and drain [3], allowing us to write: I=CbgW(Vbg-Vt)v, where Cbg is the buried oxide
capacitance, Vt is the back gate threshold voltage, W is the channel width, and v is the electron drift
velocity. Since v is the only unknown in this relation, it can be determined from the measured current. The
tangential field is simply: Ey=(Vd-IRsd)/Leff. For the measured devices W=9.5µm, and
the correction term IRsd is small, around 0.1 V. Using the above relations, in Fig.2 drift velocity is plotted versus
tangential field for a 0.47µm device. This velocity/field data is in perfect agreement with the well known
Thornber's equation [4]. Moreover, velocity tends to saturate at tangential fields above 3x10^4 V/cm, and it
does not show strong dependence on the vertical field, as expected.

Fig.3 shows the results of similar measurements on devices with different channel lengths. Drift
velocities (v(Ey)) for channel lengths longer than 0.32µm nearly overlap. However, for channel lengths
shorter than 0.32µm this overlap disappears, and the high-field velocity starts to increase with decreasing
channel length. Clearly, for L < 0.25µm, the drift velocity exceeds the saturation velocity of long channel
structures. For example, at L=0.12µm drift velocities up to 40% larger than saturation velocity are
observed. In fact, at this channel length the drift velocity shows no clear saturation behavior even at Ey =
1x10^5 V/cm. Fig.4 shows drift velocity as a function of channel length, with the tangential field as a
parameter. For low fields the measured velocities for all different channel lengths are about the same, and
no significant overshoot is observed. This is obviously not the case for larger fields.

It  should  be  noted  that  the  idea  of  uniform  charge,  field,  and  velocity  that  we  utilized  to  derive 
velocity/field  relationship  is  only  valid  for  long  channel  devices.  For  very  short  channels,  the  velocity  is 
not  constant  in  the  channel  (due  to  velocity  overshoot)  and  the  measured  value  is  an  average  velocity. 
Fig.1 Schematic cross-section of an SOI MOSFET. The
Fig. 3 Average electron velocity versus applied tangential
field, with device channel length as a parameter.


Fig.  2  Measured  electron  drift  velocity  versus  applied 
tangential  field.  Vertical  field  is  used  as  a  parameter. 
Fig. 4 Average electron drift velocity versus channel
length, with tangential field used as a parameter.
ABSTRACT

Sub-quarter micrometer PMOSFET's are fabricated on SOI
films, exhibiting excellent short channel behavior, low source-drain
resistance, and remarkably large current drive and
transconductance. For Tox=5.5nm, saturation transconductances
of 270mS/mm at 300K and 350mS/mm at 80K are achieved,
which are the highest reported values for this oxide thickness.
Direct measurements and simulation results show that the
improved current drive is due to low series resistance, forward
bias body effect, and the reduction of body charge effect.

INTRODUCTION 

Process simplicity and other advantages such as reduced
parasitic capacitance and improved short channel effect have
led to the development of silicon-on-insulator (SOI) MOSFET's.
It has also been shown that long channel SOI MOSFET's
have larger current drive than bulk MOSFET's.
However, the current drive advantage of deep sub-micron SOI
MOSFET's over bulk devices has not been demonstrated. Often
high series resistance has obscured this advantage [1]. Here, for
the first time we report experimental results for deep sub-micron
SOI PMOSFET's with improved performance over their
bulk counterparts.

SILICON  FILM  THICKNESS  CONSIDERATION 

It is difficult to achieve a large enough threshold voltage in
a polysilicon gate fully-depleted SOI device in ultra-thin silicon
film. In addition, the threshold voltage is very sensitive to film
thickness variation. Also, silicidation is necessary to reduce the
series resistance. Nearly-Fully-Depleted (NFD) devices can
provide the SOI advantages without the above drawbacks.

Based on this consideration, SIMOX substrates with SOI
film thickness of 130nm were used. Mesas were created by
plasma etching a nitride/oxide/silicon stack. Next, a 100nm
oxide was grown on the mesa sidewalls to prevent low-Vt edge
devices and gate oxide defects at the mesa corners. Threshold
implants were then performed, resulting in concentration of
1-3x10^17 cm^-3. Gate oxides of 5.5nm and 10nm thickness were
grown, followed by the deposition of 280nm of undoped polysilicon.
The poly gate was doped by a 5x10^15 cm^-2 low energy
boron implant.

The combination of P+-poly gate and silicon film thickness
and doping concentration resulted in NFD devices with the
threshold voltage range of -0.3V to -0.5V. Effective channel
lengths as short as 0.08µm were obtained by O2 "ashing" of the
gate photoresist [2].

DEVICE  PERFORMANCE 

Fig. 1 shows the I-V characteristics of a 9.5µm/0.2µm
device with Tox=5.5nm. Although the fabricated devices are
NFD, and long channel devices show a small kink in their I-V,
very short channel devices have no observable kink as seen in
Fig.1. This is due to the fact that the depletion regions of the
source/drain junctions effectively deplete the film at drain voltages
lower than onset of the kink. This is demonstrated by
PISCES simulation of a PMOSFET with Leff=0.2µm. As seen
in Fig.2, the barrier potential at the bottom of the silicon film is
lowered, reducing the number of electrons that can accumulate
in this potential well.


Drain Voltage (V)


Fig. 1 I-V characteristics of an NFD PMOSFET. No
observable kink is present.


Fig. 2 PISCES simulation of maximum barrier potential at the
bottom of silicon film (at buried oxide interface). Silicon
film thickness is 130nm.

Fig.3 shows the subthreshold swing and threshold voltage
shift (ΔVt) of the fabricated PMOSFET's. Although ultra-thin
film is not used, good subthreshold swing (75-80mV/dec) and
excellent short channel behavior is obtained. Virtually no
threshold voltage shift is observed for devices down to 0.2µm.
Fig. 4 shows the subthreshold characteristics of a 9.5µm/0.2µm
device with Tox=10nm.


Fig. 3 ΔVt and S versus effective channel length. ΔVt is
the difference between Vt of a long channel device and
Vt of the given device. Vds=-0.1V.


Fig. 4 Subthreshold characteristics of an NFD PMOSFET
with Leff=0.2µm.


Fig. 5 shows that for Tox=5.5nm, the device with Leff of
0.15µm has Gm of 274mS/mm at 300K and 352mS/mm at 80K.
These are measured values and have not been corrected for
the series resistance effect. In fact, a key to the achieved
transconductance is the relatively low series resistance, which
ranges from 700Ω·µm to 1100Ω·µm for our devices. The Gm of our
devices with Tox=10nm is higher than those reported for bulk devices
with Tox=6nm and Tox=8nm [3,4]. Fig. 6 similarly shows that
Idsat of the present devices is larger than recently reported values
for both bulk and SOI devices with thinner gate oxides [1,4,5].
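
Since the quoted transconductances are uncorrected, a first-order
estimate of the intrinsic value can be sketched as follows. The sketch
assumes the textbook relation gm_ext = gm_int / (1 + gm_int*Rs),
neglects output conductance, and assumes that the quoted resistance is
the total source-plus-drain value split equally between the two sides.
None of these assumptions is taken from the measurements above, and the
function name is hypothetical.

```python
def intrinsic_gm(gm_ext_ms_per_mm, r_total_ohm_um, source_fraction=0.5):
    """Estimate intrinsic transconductance from the measured (extrinsic) value.

    Uses gm_ext = gm_int / (1 + gm_int * Rs), i.e.
    gm_int = gm_ext / (1 - gm_ext * Rs), neglecting output conductance.
    Rs is the source-side resistance only (assumed fraction of the total).
    """
    gm_ext = gm_ext_ms_per_mm * 1e-6             # mS/mm -> S/um
    r_source = r_total_ohm_um * source_fraction  # ohm*um, assumed split
    gm_int = gm_ext / (1.0 - gm_ext * r_source)
    return gm_int * 1e6                          # S/um -> mS/mm


# Measured Gm values (300K, 80K) and the two ends of the resistance range.
for gm in (274.0, 352.0):
    for r_total in (700.0, 1100.0):
        print(f"Gm={gm} mS/mm, R={r_total} ohm*um -> "
              f"{intrinsic_gm(gm, r_total):.1f} mS/mm intrinsic")
```

The correction grows with both Gm and Rs, which is why low series
resistance matters more as channel length shrinks.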


Fig. 5 Measured saturation transconductance versus
effective channel length.


Fig. 6 Measured saturation current (Idsat) versus
effective channel length. Vgs - Vt = -1.5V.


There are several reasons for SOI devices built on thin
silicon films to have higher Gm and Idsat than their bulk
counterparts. Fully-depleted (FD) and Nearly-Fully-Depleted (NFD)
devices have reduced or no bulk charge effect, which raises the
local Vt increasingly toward the drain. Reduced bulk charge
also reduces the local effective vertical field and improves the
carrier mobility [6]. Simulation of this effect is shown for a
0.5µm PMOSFET in Fig. 7. As seen, the SOI device has a larger
current drive than the bulk MOSFET. The bulk charge effect,
however, becomes a smaller factor for shorter devices. Fig. 8
shows that the improvement in SOI current drive over bulk due
to this component disappears at Leff=0.1µm.


Fig. 7 PISCES simulation of body charge effect for a
0.5µm PMOSFET. The SOI device has larger current
drive due to absence of bulk charge effect.


Fig.  8  PISCES  simulation  of  improvement  in  SOI  current 
drive over bulk due to reduction of body charge effect.
Two  film  thicknesses  are  simulated. 


An additional effect not reported before is the effect of
forward bias on the floating body even before the onset of the kink.
Fig. 9 demonstrates this with a four-terminal SOI device that has
a body contact. Drain current is clearly larger with the fourth
terminal (body contact) open than grounded. In Fig. 10, I-V curves for
the floating body case are compared with the case where -0.3V
is applied to the body contact. These two sets of curves match
before the kink, indicating that a forward bias of about 0.3V is
present when the body is floating (and before onset of the kink).
At this body voltage the drain-to-body reverse leakage current is
equal to the body-to-source forward current, as shown in Fig. 11.
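
The current balance described above can be sketched numerically.
Assuming an ideal diode law for the body-to-source junction (ideality
factor 1, which is an assumption and not a measured value), the floating
body settles at the forward bias where the diode current equals the
drain-to-body leakage. Magnitudes are used throughout, so the sign
convention of the PMOSFET is ignored; the function names and the
saturation current in the round-trip check are hypothetical.

```python
import math

K_B_OVER_Q = 8.617333e-5  # Boltzmann constant over electron charge, V/K


def floating_body_bias(i_leak, i_sat, temp_k=300.0, ideality=1.0):
    """Forward bias at which i_sat*(exp(V/(n*kT/q)) - 1) equals i_leak."""
    v_th = ideality * K_B_OVER_Q * temp_k
    return v_th * math.log1p(i_leak / i_sat)


def leak_to_sat_ratio(v_forward, temp_k=300.0, ideality=1.0):
    """Inverse relation: leakage/saturation-current ratio for a given bias."""
    v_th = ideality * K_B_OVER_Q * temp_k
    return math.expm1(v_forward / v_th)


# Ratio of leakage to diode saturation current implied by a 0.3V bias.
ratio = leak_to_sat_ratio(0.3)
print(f"I_leak / I_sat for 0.3 V forward bias: {ratio:.3g}")

# Round trip with a hypothetical saturation current of 1e-18 A.
print(f"Recovered bias: {floating_body_bias(ratio * 1e-18, 1e-18):.3f} V")
```

Because the dependence is logarithmic, the floating-body bias changes
by only about 60mV at room temperature for each decade of leakage.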


Fig. 9 Circles show I-V when body is grounded. Solid
lines show I-V when body is floating. Vg steps are -0.5V each.


Fig. 10 Circles show I-V when -0.3V is applied to the body.
Lines show I-V when body is floating. Vg steps are -0.5V each.

The deviation of the two sets of curves in Fig. 10 (after the
kink) is due to the amplification of substrate current by the
parasitic lateral BJT (for the floating body case). Current gain of
the lateral BJT can be significant, as seen in Fig. 12, where the
above PMOSFET is operated as a PNP transistor with the body
contact used as the base contact.


Fig. 12 A four-terminal SOI PMOSFET operated as a PNP
transistor. Current gain > 30.


SUMMARY 


Nearly-Fully-Depleted (NFD) PMOSFETs with effective
channel lengths down to 0.15µm were fabricated on 130nm
thick SOI films. These devices exhibit excellent short channel
behavior, low series resistance, and remarkable Gm and Idsat.
For Tox=5.5nm, Gm of 270mS/mm at 300K and 350mS/mm at
80K are achieved, which are the highest reported values for this
oxide thickness. This result is attributed to low series resistance,
forward bias body effect, and the reduction of body charge
effect.

Abstract

A new mode of operation for Silicon-On-Insulator (SOI) MOSFET is experimentally
investigated. This mode gives rise to a Variable Threshold voltage MOSFET (VTMOS). VTMOS
threshold voltage drops as gate voltage is raised, resulting in a much higher current drive than
a regular MOSFET at low Vdd. On the other hand, Vt is high at Vgs=0, thus the leakage current is
low. Suitability of this device for ultra-low-voltage operation is demonstrated by ring oscillator
performance down to Vdd = 0.5V.

Introduction


During the past few years demand for low power and high performance digital systems has
grown rapidly. The main approach for reducing power has relied on power supply scaling. Since
power supply reduction below 3Vt will degrade circuit speed significantly, scaling of power
supply should be accompanied by threshold voltage reduction. However, the lower limit for
threshold voltage is set by the amount of off-state leakage current that can be tolerated (due to
standby power consideration in static circuits, and avoidance of failure in dynamic circuits and
memory arrays). To extend the lower bound of power supply, we propose a Variable Threshold
voltage MOSFET (VTMOS) with the highest Vt at zero bias and the lowest value at Vgs=Vdd. In
the remainder of this paper we will describe the operation of the device, and show its superiority
over a regular MOSFET. We will also show some circuit performance results using VTMOS.

Experiment  and  Results 


The SOI devices used in the study are built on SIMOX wafers. Mesa active islands
(MESA) were created by plasma-etching a nitride/oxide/silicon stack stopping at buried oxide. P+
polysilicon gate was used for PMOSFETs and N+ for NMOSFETs. A four-terminal layout was
used to provide separate source, drain, gate, and body contacts. In addition to the four-terminal
layout, devices with local gate-to-body connections were also fabricated as illustrated in Fig. 1.
This connection uses an oversized metal to P+ contact window aligned over a "hole" in the poly
gate [1]. The metal shorts the gate and P+ region. Thus, there is no significant penalty in area.

To operate the VTMOS, floating body and gate of a Silicon-On-Insulator (SOI) MOSFET
are tied together. This is not a new configuration, as [1-3] have already suggested it. However,
[1-3] all tried to exploit the extra current produced by the lateral bipolar transistor. This normally
requires the body voltage to be larger than 0.6V. Since current gain of the bipolar device is small,
extra drain (collector) current comes at the cost of excessive input (base) current, which contributes to
the standby current. We will show that most of the improvement can be achieved when gate and
body voltages are kept below 0.6V. This also ensures that base current will stay negligible.
Although the same idea can be used in bulk devices, better advantage is reached in SOI, where
because of very small junction areas base current and capacitances are appreciably reduced.

Fig. 2 illustrates the NMOS behavior, with a separate terminal used to control the body
voltage. The threshold voltage at zero body bias is denoted by Vt0. Body bias effect is normally
studied in the reverse bias regime, where threshold voltage increases as body to source reverse bias
is made larger. We propose to use the exact opposite regime. Namely, we "forward bias" the body-
source junction (at less than 0.6V), forcing the threshold voltage to drop.

Specifically, this forward bias effect is achieved by connecting the gate to the body. This is
shown as the Vgs=Vbs line in Fig. 2. The intersection of the Vt curve and the Vgs=Vbs line determines
the point where gate and threshold voltages become identical. This point, which is marked as Vtf, is the
VTMOS threshold voltage. This lower threshold voltage does not come at the expense of higher off-
state leakage current, because at Vbs=Vgs=0 VTMOS and the regular device have the same Vt. In
fact, they are identical in all respects and consequently have the same leakage. This is clearly seen
in Fig. 3. Reduced Vtf compared to Vt0 is attained through a theoretically ideal subthreshold
swing of 60mV/dec. Fig. 3 demonstrates this for PMOS and NMOS devices operated in VTMOS
mode and in regular mode. Subthreshold swing is 80mV/dec in the regular devices.
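
The 60mV/dec figure is the room-temperature thermal limit, (kT/q) ln 10.
The short sketch below evaluates that limit and, assuming the standard
expression S = (kT/q) ln 10 (1 + Cd/Cox), backs out the body factor
implied by the 80mV/dec of the regular device. The expression and the
function names are assumptions introduced for illustration and are not
taken from the measurements.

```python
import math

K_B_OVER_Q = 8.617333e-5  # Boltzmann constant over electron charge, V/K


def ideal_swing_mv_per_dec(temp_k=300.0):
    """Thermal-limit subthreshold swing, (kT/q) * ln(10), in mV/decade."""
    return K_B_OVER_Q * temp_k * math.log(10.0) * 1e3


def body_factor(swing_mv_per_dec, temp_k=300.0):
    """Factor m = 1 + Cd/Cox implied by a measured swing (assumed model)."""
    return swing_mv_per_dec / ideal_swing_mv_per_dec(temp_k)


print(f"Ideal swing at 300K: {ideal_swing_mv_per_dec():.1f} mV/dec")
print(f"Body factor for 80 mV/dec: {body_factor(80.0):.2f}")
```

In this picture, tying the body to the gate removes the capacitive
division between gate and body, so the factor returns to unity.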

This is not the only improvement. As the gate of VTMOS is raised above Vtf, threshold
voltage drops further. The threshold voltage reduction continues until Vgs=Vbs reaches 2φb, and
threshold voltage reaches its minimum value of Vt,min=2φb+Vfb. For example, for technology-B
in Fig. 2, at Vgs=Vbs=0.6V, Vt=0.18V compared to Vt0=0.4V. In VTMOS operation the upper
bound for applied Vgs=Vbs is set by the amount of base current that can be tolerated. This is
illustrated in Fig. 3, where PMOS and NMOS device body (base) currents are shown. At
Vgs=0.6V base currents for both PMOS and NMOS devices are less than 2nA/µm. A further
advantage of VTMOS is that its carrier mobility is expected to be higher because the depletion
charge is reduced and the effective normal field in the channel is lowered [4].
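
The threshold behavior described above can be reproduced approximately
with the textbook long-channel body-effect expression,
Vt = Vfb + 2φb + γ sqrt(2φb - Vbs), which reduces to 2φb + Vfb at
Vbs = 2φb. The sketch below is only an illustration under stated
assumptions: the expression itself, the intrinsic carrier density, and
the flat-band voltage (calibrated so that Vt equals 0.4V at zero body
bias) are not given in the text. Oxide thickness and doping are the
technology-B values from the Fig. 2 caption.

```python
import math

Q = 1.602177e-19        # electron charge, C
EPS0 = 8.854188e-14     # vacuum permittivity, F/cm
EPS_SI = 11.7 * EPS0    # silicon permittivity
EPS_OX = 3.9 * EPS0     # silicon dioxide permittivity
KT_Q = 0.025852         # thermal voltage at 300K, V
N_I = 1.0e10            # intrinsic carrier density, cm^-3 (assumed)

TOX_CM = 6.4e-7         # technology-B gate oxide thickness (6.4nm)
NA_CM3 = 2.3e17         # technology-B body doping, cm^-3
VT0 = 0.4               # technology-B threshold at zero body bias, V

PHI_B = KT_Q * math.log(NA_CM3 / N_I)
GAMMA = math.sqrt(2.0 * Q * EPS_SI * NA_CM3) / (EPS_OX / TOX_CM)
# Flat-band voltage is not given in the text; calibrate it so Vt(0) = VT0.
V_FB = VT0 - 2.0 * PHI_B - GAMMA * math.sqrt(2.0 * PHI_B)


def threshold(v_bs):
    """Threshold voltage for forward body bias 0 <= v_bs <= 2*PHI_B."""
    return V_FB + 2.0 * PHI_B + GAMMA * math.sqrt(2.0 * PHI_B - v_bs)


def vtmos_threshold(tol=1e-9):
    """Bisection for the point where Vgs = Vbs = Vt (the Vtf crossing)."""
    lo, hi = 0.0, 2.0 * PHI_B
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if threshold(mid) > mid:
            lo = mid   # threshold curve still above the Vgs=Vbs line
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(f"Vt at Vbs=0.6V: {threshold(0.6):.3f} V")
print(f"VTMOS threshold Vtf: {vtmos_threshold():.3f} V")
```

Under these assumptions the model gives a threshold of roughly 0.19V
at Vgs=Vbs=0.6V, close to the 0.18V quoted for technology-B.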
Current drives of VTMOS and regular MOSFET are compared in Fig. 4, for technology-B
of Fig. 2. VTMOS drain current is 2.5 times that of the regular device at Vgs=0.6V, and 5.5 times
that of the regular device at Vgs=0.3V. AC performance of VTMOS is evaluated by an unloaded
101-stage CMOS ring oscillator. Fig. 5 plots the delay of each stage versus power supply. We
emphasize that since the threshold voltages of devices used in the ring oscillator were high
(technology-A), the optimum performance was not achieved. For technology-B, ring oscillators
are not available. If the devices based on technology-B are used, the expected delay for an
unloaded ring oscillator can be calculated by the following equation [5]:

$$\tau = \frac{C V_{dd}}{4}\left(\frac{1}{I_{dsatn}} + \frac{1}{I_{dsatp}}\right)$$

This is shown as the dashed line in Fig. 5, where C=200fF is used for Wn=5µm and Wp=10µm.
This value for C was obtained by fitting the equation to the measured delay of technology-A.
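
A minimal sketch of such a delay estimate follows. It assumes the
commonly used form delay = (C Vdd / 4)(1/Idsatn + 1/Idsatp) for an
unloaded CMOS stage. The load capacitance is the 200fF fitted value
quoted above; the saturation currents are hypothetical placeholders,
since the measured currents are only available graphically in Fig. 4.

```python
def stage_delay(c_load, v_dd, i_dsat_n, i_dsat_p):
    """Per-stage delay estimate in seconds (SI units for all inputs)."""
    return (c_load * v_dd / 4.0) * (1.0 / i_dsat_n + 1.0 / i_dsat_p)


C_LOAD = 200e-15      # F, fitted value quoted in the text
I_N = 0.5e-3          # A, hypothetical NMOS saturation current
I_P = 0.5e-3          # A, hypothetical PMOS saturation current

for v_dd in (0.5, 0.6):
    delay = stage_delay(C_LOAD, v_dd, I_N, I_P)
    print(f"Vdd={v_dd} V -> {delay * 1e12:.0f} ps per stage")
```

In practice the saturation currents themselves depend strongly on Vdd,
so a real prediction would evaluate them at each supply voltage rather
than hold them fixed as this sketch does.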


Conclusion 

For low power operation at very low voltage, a MOSFET should ideally have a high Vt at
Vgs=0 to achieve low leakage and low Vt at Vgs=Vdd to achieve high speed. By tying body and
gate of an SOI MOSFET together, a variable threshold voltage MOSFET (VTMOS) is obtained.
This device has ideal 60mV/dec subthreshold swing. VTMOS threshold voltage drops as gate
voltage is raised, resulting in much higher current drive than a regular MOSFET. VTMOS is ideal
for very low voltage (< 0.6V) operation, as demonstrated by ring oscillator data. VTMOS also
solves the floating body problems of SOI MOSFET such as kinks and Vt stability. Furthermore,
carrier mobility is enhanced.

Figure Captions


Fig. 1 a) Cross section of an SOI NMOSFET with body and gate tied together. b) Gate
to body connection by using aluminum to short the gate and P+ region.

Fig. 2 Threshold voltage of SOI NMOSFET as a function of body-source forward bias.
For Technology-A, Tox=10nm, Na=2.0x10^17 cm^-3. For Technology-B, Tox=6.4nm,
Na=2.3x10^17 cm^-3.

Fig. 3 Subthreshold characteristics of SOI NMOSFET and PMOSFET operated with
body grounded and body tied to the gate. Body to source currents are also shown for the
case of VTMOS (body tied to the gate).

Fig. 4 Drain current of an SOI NMOSFET operated as a VTMOS and as a regular
device.

Fig. 5 Delay of a 101-stage ring oscillator. The PMOS and NMOS devices in the ring are
VTMOS with Tox=10nm, Leff=0.3µm, and Vt0=0.6V. The dashed line is the prediction of
delay for a ring oscillator based on Technology-B with Leff=0.3µm.


Gate To Body Contact Cross-Section


INTRODUCTION

CMOS technology built on SOI substrate has many advantages over its bulk
counterpart. The most often quoted are higher current drive, reduced parasitic
capacitance, better device isolation and thus no latch-up, and superior short-channel
behavior. It is also a simpler technology to develop and implement in the deep-submicron
regime. However, the inherent floating body effect of SOI MOSFETs generates many
problems, the most famous of which is the "kink" effect. A fully-depleted (FD) device
reduces the kink, but, as this paper shows, does not solve many of the problems that limit
its applicability, especially in analog circuits. We propose and demonstrate that a new
simple Low-Barrier Body-Contact (LBBC) technology eliminates these problems.
FABRICATION  PROCESS 

Fig. 1 shows an NMOSFET with the new LBBC structure. Structure of the
PMOSFET is similar. The fabrication process is a modified version of the conventional
CMOS SOI process described in [1]. Only key steps of the NMOS process will be given
here. After the gate definition step, a deep boron implant was performed with dose 1x10*^
cm-2 and energy 60keV. This formed a moderately doped P region close to the buried
oxide. Then a shallow source/drain arsenic implant was performed with dose SxlO’^cm-^
and energy 25keV, followed by a shallow boron implant, which is the source/drain implant
for PMOS in a CMOS process. A special implant mask was used so that this
implantation step gives a P+ region right next to the N+ source of the NMOSFET. This
P+ region is butted together with the N+ source and shares the same contact. The underlying P
region formed by the tailored boron implant, either neutral or depleted, provides a low-
barrier path for the holes generated by impact ionization to be collected through the butted
P+ region. Ploeg proposed a conceptually similar dual source structure [2]. However, the
scheme depended on the Al spiking phenomenon, which is sensitive to process variation
and incompatible with VLSI junction and contact technology (for example, silicidation).
DEVICE  PERFORMANCE 

The I-V characteristics of NMOSFETs and PMOSFETs fabricated with different
technologies are shown in Fig. 2 and 3 respectively. The LBBC MOSFETs exhibit higher
breakdown voltage, especially at low Vg, and a very constant Idsat which is free of kink.
This shows that the LBBC is very effective in collecting the substrate current. Fig. 4


shows  the  collection  efficiency  of  the  LBBC  compared  with  the  bulk  MOSFETs 
fabricated  in  the  same  lot.  At  low  the  LBBC  is  capable  of  collecting  the  same 
amount  of  substrate  current  as  that  of  the  bulk.  This  substrate  current  collection  scheme 
is in fact much more effective than the normal side-body contact scheme that sacrifices
significant  area. 

The  output  resistance  (Rout)  of  MOSFETs  with  LBBC  is  compared  with 
conventional  kink  free  FD  MOSFETs  (Fig.  5).  The  FD  Structure  essentially  softened  the 
kink,  but  did  not  actually  eliminate  it.  So  the  resulting  output  resistance  is  very  low, 
especially  at  low  Vg  due  to  threshold  reduction  as  a  result  of  DIBL  and/or  impact 
ionization (Fig. 7 and 8). As shown in Fig. 5, the Rout can be improved by at least one
order  of  magnitude  by  using  the  LBBC  technology.  The  negative  resistance  due  to  self¬ 
heating  is  suppressed  by  using  a  thick  Si  film  (O.lb^im)  and  thin  buried  oxide  (0.1  Ip-m). 
Fig.  6  shows  the  voltage  gain  of  LBBC  MOSFETs  and  conventional  FD  MOSFETs. 
Much  higher  gain,  which  is  important  for  analog  applications,  can  be  obtained  with  the 
LBBC  technology. 

The subthreshold characteristics of 0.2µm LBBC and FD MOSFETs are shown
in Fig. 7 and 8. The abnormally high subthreshold slope due to charging of the floating body
by impact ionization at high drain voltages [3] is completely removed. Thus, a much
lower off-state leakage current and better gate control of drain current can be achieved for
both PMOSFETs and NMOSFETs.

The flicker noise characteristics of the LBBC MOSFETs, bulk MOSFETs and
FD MOSFETs, another important consideration for analog applications, are shown in Fig.
9 and 10. The LBBC MOSFETs show a much lower flicker noise. In conventional FD
MOSFETs, the floating body cannot sink any junction leakage and substrate current
caused by hot-carrier effects. This can result in fluctuation of surface potential, which
in turn modulates the channel carrier density. With the LBBC technology, the extra
current can be sunk, resulting in a bulk-like flicker noise level.

Fig. 11 and 12 compare the threshold voltage drop (ΔVt) and subthreshold swing
(S) shift due to short channel effect between the LBBC and FD MOSFETs. The slight
random variation in Vt in the FD SOI is caused by variation of silicon film thickness,
which is not acceptable in low voltage digital circuits or high precision analog circuits.
As can be seen, MOSFETs with the LBBC structure show improved short channel
behavior over conventional FD SOI MOSFETs.

CONCLUSION 

The LBBC structure has been developed which can greatly improve SOI
MOSFET performance for digital and analog applications. The process only requires 2
extra masks for a CMOS process and an insignificant amount of extra area. The LBBC is
the most effective substrate current collection scheme reported.

Fig. 1: NMOSFET with the Low-Barrier Body-Contact.
The narrow P low-barrier path was formed by
a tailored 10 “cm ’ boron implant at source/
drain implant step. PMOSFET fabricated has
similar parameters.


Fig. 2: I-V characteristics of NMOSFETs fabricated
with different technologies


Fig. 3: I-V characteristics of PMOSFETs fabricated
with different technologies


Fig. 4: Comparison of substrate current collection
effectiveness between bulk MOSFETs and SOI
MOSFETs with Low-Barrier Body-Contact


Fig. 5: Comparison of output resistance between
SOI with Low-Barrier Body-Contact and
fully-depleted SOI

Fig. 6: Comparison of single transistor small signal
voltage gain between fully-depleted SOI and
SOI with Low-Barrier Body-Contact

Abstract

A  new  Recess-Channel  SOI  (RCSOI)  Technology  has  been  developed  for  fabricating  ultra-thin  SOI 
MOSFETs  with  low  source/drain  series  resistance.  Thin-film  fully-depleted  SOI  MOSFETs  with 
channel  film  thickness  of  72nm  have  been  fabricated  with  the  RCSOI  technology.  The  new 
structure  demonstrated  a  70%  reduction  in  source/drain  series  resistance  compared  with 
conventional  processes.  In  the  deep-submicron  regime,  more  than  80%  improvement  in  saturation 
drain  current  and  transconductance  over  conventional  devices  was  achieved  using  the  RCSOI 
technology.  The  new  technology  would  also  facilitate  the  use  of  silicide  for  further  reducing  the 
series  resistance. 


INTRODUCTION 


Fully-depleted (FD) MOSFETs fabricated on ultra-thin silicon-on-insulator (SOI) films have
received significant attention for integrated circuit applications due to reduced parasitic
capacitance, simple device isolation, better short channel behavior and radiation hardness [1].
However, as silicon film thickness (tsi) is reduced, source/drain series resistance (Rsd) of SOI
MOSFETs increases, which in turn significantly reduces the current drive and speed response in the
deep-submicron regime [2]. Silicide technology, which is used to reduce Rsd in bulk MOSFETs, is
difficult to apply to ultra-thin film SOI wafers. For example, Schottky diode behavior has been
observed for nMOSFETs in some SAlicide SOI processes [3], most likely caused by additional
lateral migration of the silicide into the source/drain junction due to the rapid lateral consumption
of the finite volume of silicon. Also, high Rsd may still result after silicidation, as reported in [4]. In
this letter, we propose and demonstrate that a simple Recess-Channel SOI (RCSOI) technology can
significantly reduce Rsd of thin-film FD SOI MOSFETs. The process is much simpler and more
robust than other technologies for the same purpose using selective epi-silicon [5] and selective
CVD tungsten [6], especially when tsi at the channel is below 100nm.

FABRICATION  PROCESS 

The fabrication process is a modified version of the conventional CMOS SOI MESA
process described in [7]. The key steps are shown in Fig. 1. The starting wafers have silicon film
thickness of 195nm. A LOCOS process was performed at the channel region forming 180nm of
oxide, which consumes 85nm of silicon. The oxide was then wet etched, giving a silicon film of
110nm at the channel. Because it was not a self-aligned process, a margin of 0.3µm, limited by
layer-to-layer registration, was given to both sides of the gate. Mesa active islands (MESA) were
created by plasma-etching a nitride/oxide/silicon stack stopping at buried oxide. Next, a 100nm
oxide was grown on the MESA sidewalls to prevent low-Vt edge devices and gate oxide defects at
the MESA corners. Threshold implant was then performed, resulting in a concentration of
1-SxlO'^cm'^ in the silicon film. Gate oxide of 8.6nm was grown, followed by the deposition of
270nm of polysilicon. Doping of the polysilicon gate was realized by a 25keV, SxlO’^cm-^ boron
implant and a 50keV, SxlO'^cm'^ phosphorus implant, giving P+ gate for pMOSFETs and N+ gate for
nMOSFETs. A photoresist-ashing process [8] was then performed to give deep-submicron
transistors with effective channel length as small as 0.1µm. Source/drain implant with As for
nMOSFETs and BF2 for pMOSFETs was done using an implant energy of 30keV and a dose of
3x10^15 cm^-2. The final cross-section is also shown in Fig. 1.
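
The film-thickness bookkeeping of the recess step described above can
be checked with a few lines. The sketch uses the thicknesses quoted in
the text and, for comparison, the textbook rule of thumb that thermal
oxidation consumes roughly 0.44 of the grown oxide thickness in
silicon. That ratio is an assumption introduced here, not a value from
the process description, and the function names are hypothetical.

```python
def remaining_silicon_nm(start_nm, consumed_nm):
    """Silicon film thickness left after oxidation and oxide strip."""
    return start_nm - consumed_nm


def consumed_silicon_nm(oxide_nm, ratio=0.44):
    """Silicon consumed by growing oxide_nm of thermal oxide (rule of thumb)."""
    return ratio * oxide_nm


START_NM = 195.0      # starting film thickness quoted in the text
OXIDE_NM = 180.0      # LOCOS oxide thickness quoted in the text
CONSUMED_NM = 85.0    # silicon consumption quoted in the text

print("Channel film after recess:",
      remaining_silicon_nm(START_NM, CONSUMED_NM), "nm")
print("Implied consumption ratio:", round(CONSUMED_NM / OXIDE_NM, 2))
print("Rule-of-thumb consumption:",
      round(consumed_silicon_nm(OXIDE_NM), 1), "nm")
```

The quoted numbers are self-consistent (195nm minus 85nm gives the
stated 110nm) and imply a consumption ratio slightly above the
rule-of-thumb value.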

DEVICE  PERFORMANCE 

The RCSOI MOSFETs demonstrated here have final silicon film thickness of 165nm at the
source/drain region and 72nm at the channel after subsequent oxidation and etching. The contact-
to-gate spacing is about 1.5µm. Table 1 and Fig. 2 summarize the performance of MOSFETs with
effective channel length (Leff) of 0.3µm, fabricated using different technologies. The high Rsd
that results from the conventional process reduces the saturation drain current (Idsat) and saturation
transconductance (gmsat) of deep-submicron MOSFETs significantly. The saturation voltage (Vdsat)
in the presence of high Rsd is also much higher, thus preventing the use of deep-submicron
MOSFETs for low voltage applications. The RCSOI technology is capable of reducing the Rsd by a
factor of 3 at this channel film thickness. This factor is expected to be larger when the silicon film
is thinner. An 80% improvement in Idsat and gmsat over conventional devices has been achieved using
the RCSOI technology. The Vdsat of the new devices is also reduced by 45% for nMOS and 35% for
pMOS. Lower Vdsat is the cause of lower breakdown voltage in the RCSOI MOSFET (Fig. 2). Note
that the thick-film SOI MOSFETs used for comparison here are non-fully-depleted (NFD), which
have lower intrinsic Idsat and gm due to the substrate charge effect [9]. But their lower Rsd makes their
performance comparable with the FD SOI MOSFETs.

The measured Idsat versus Leff of different MOSFETs is shown in Fig. 3. Decreasing
channel length aggravates the reduction of Idsat (and similarly, gm) in conventional SOI MOSFETs
because the debiasing effect of Rsd becomes stronger as current increases, thus diminishing the
advantages of scaling. With the RCSOI technology, 85% of the intrinsic Idsat (assuming Rsd = 0)
can be achieved, versus less than 50% achieved by the conventional process at Leff below 0.5µm. The
impact of reduction in tsi on Rsd and Idsat is shown in Fig. 4. Note that the tsi is the measured [10]
film thickness at the channel, not the thickness at the source/drain region. Due to the re-oxidation
after the source/drain implant and over-etch in contact opening, the silicon film on the source/drain
region is expected to be thinner. The very high series resistance in the 42nm silicon film may be caused
by a contact problem when making contact to the ultra-thin silicon film. Such a contact problem can
be eliminated by the RCSOI technology, so that arbitrarily thin SOI MOSFETs can be fabricated. Also,
silicide technology may be used in conjunction with the Recess-Channel technology to further
reduce the source/drain series resistance. The thicker silicon film in the source/drain region
provides more silicon for the formation of silicide, making the silicide process much easier to apply.

The only potential drawback of the RCSOI technology is the non-self-aligned nature of the
process, which may result in asymmetric device characteristics. However, this kind of behavior
was not observed in our measurements.

CONCLUSIONS 

A new Recess-Channel SOI technology has been developed. It significantly reduces Rsd,
thus increasing the current drive and the transistor gain in deep-submicron SOI MOSFETs. This
technology is potentially very useful for fabricating high performance ultra-thin SOI MOSFETs
with arbitrary silicon film thickness. The process is compatible with most of the existing CMOS
processes, including silicidation for further reducing the source/drain series resistance.

Figure Captions


Fig. 1: Key steps of the RCSOI process and the final cross-section of the device

Table  1 :  A  summary  of  device  performance.  The  MOSFETs  demonstrated  have  Tq^  =  8.6nm, 
Leff  =  0.3pm.  and  are  measured  at  -  V,h  =  1.5V  and  « 

measured  at  =  2V.  Subthreshold  swing  (S)  is  measured  at  V(j5=0.1V.  The  film 
thicknesses are 72nm for thin-film, 165nm for thick-film. The RCSOI structure has
72nm at the channel and 165nm at the source/drain region.

Fig.  2:  I-V  characteristics  of  a  conventional  thin-film  SOI  nMOSFET  and  a  nMOSFET  with 
Recess-Channel  structure. 

Fig. 3: Measured Idsat versus Leff of conventional SOI MOSFETs and MOSFETs with RCSOI
structure.

Fig. 4: Experimentally measured Rsd and Idsat versus silicon film thickness. The MOSFETs used
have Leff = 0.3µm, Tox = 8.6nm. Idsat is measured at Vp=1.5V absolute value. The Idsat
of RCSOI MOSFETs is also shown as a comparison.

Abstract

ESD  protection  capability  of  SOI  CMOS  output  buffers 
has  been  studied  with  Human  Body  Model  (HBM)  stresses  of 
both  positive  and  negative  polarity.  Experimental  results  show 
that  the  ESD  discharge  current  is  absorbed  by  the  NMOSFET 
alone. Unlike bulk technologies where the bi-directional ESD
failure voltages are limited by positive polarity stresses, SOI
circuits display a more serious reliability problem in handling
negative ESD discharge current. Bulk NMOS output buffers
fabricated  on  the  substrate  of  the  same  SOI  wafers,  after  etching 
away  the  buried  oxide,  have  been  used  to  compare  the  ESD 
protection  capability  between  bulk  and  SOI  technologies.  The 
ESD  voltage  sustained  by  these  "bulk"  NMOS  buffers  is  about 
twice  the  voltage  sustained  by  conventional  SOI  NMOS  buffers. 
This  scheme  is  proposed  as  an  alternative  ESD  protection  for 
SOI  circuits.  The  effectiveness  of  ESD  resistant  design 
strategies  developed  in  bulk-substrate  technologies  when 
transferred  to  SOI  circuits  is  also  discussed  in  this  paper. 

Introduction 

As  VLSI  circuits  are  becoming  more  performance 
driven,  Silicon-On-Insulator  (SOI)  CMOS  technologies  have 
become  very  attractive.  By  dielectrically  isolating  circuit 
elements,  SOI  technologies  eliminate  transistor  latch-up  and 
provide  reduced  junction  capacitance.  Such  reduction  in 
parasitic  capacitance  allows  IC's  to  operate  at  much  higher 
circuit  speeds  than  conventional  bulk-substrate  silicon  IC’s  with 
the  same  device  dimensions.  Because  of  the  better  short- 
channel  behavior,  higher  circuit  density,  and  simpler  fabrication 
process,  SOI  technologies  show  great  potential  to  become  the 
low-cost mainstream production technologies [1].

With  the  rapid  advancement  of  SOI  technology, 
electrostatic  discharge  (ESD)  susceptibility  becomes  one  of  the 
major reliability issues. However, very little attention has been
paid to ESD phenomena for SOI circuits. In bulk-substrate
technologies, good ESD protection levels have been
demonstrated by using NMOS/CMOS output buffers [2,3].
However, most of the protection schemes developed for bulk
may not be compatible with SOI structures. For example, the
use of thick-field-oxide devices becomes impractical on SOI
wafers. Large-area low-series-resistance (vertical) PN junctions
are not available either, as the silicon film is usually thinner
than 150nm. Several ESD protection schemes designed for SOI
circuits have been proposed, which use additional circuits
constructed with diodes and polysilicon resistors [4]. These
solutions  consume  large  silicon  area,  introduce  large  delays,  and 
are  far  from  adequate. 

In  this  paper,  the  ESD  susceptibility  of  submicron  SOI 
NMOS/CMOS  output  buffers  is  studied  with  HBM  stresses.  The 
failure  mechanisms  of  these  buffers  are  investigated  to  provide 
an understanding of SOI ESD phenomena. The impact of
different design parameters such as gate-to-contact spacing,
silicon film thickness (T_si) and effective channel width (W_eff) on
ESD susceptibility during HBM stresses is also presented. The
results  are  compared  with  NMOS  buffers  which  have  similar 
physical  structures  and  fabrication  processes  to  reveal  the 
effectiveness  of  improving  SOI  ESD  performance  by  design 
strategies.  And  finally,  an  alternative  ESD  protection  scheme  of 
fabricating  the  output  buffers  on  the  substrate  of  SOI  wafers  is 
discussed. 

Experimentation 

To study the ESD phenomena of SOI circuits, non-optimized
deep-submicron MOS buffers with 250µm effective
channel width have been fabricated. The layout utilized the
'finger structure' as shown in Fig. 1 to achieve a more uniform
current density [5]. The silicon film thickness and the buried
oxide thickness (T_box) are 163nm and 100nm respectively. All
these transistors are fabricated on SIMOX (Separation by
IMplanted OXygen) wafers with MESA isolation process [6]. A
thin gate oxide of 8nm is used to suppress short channel effects
in the deep-submicron regime, where SOI technologies show
superior performance over bulk technologies. The effective
channel lengths ranged from 0.3µm to 1.5µm and the
gate-to-contact spacing is 2µm. Because the breakdown
characteristics of a NMOSFET can be very different with body
floating and body grounded, MOSFETs with special body
contacts [7] are also tested in the study.


Fig. 1. Ladder structure used in NMOS output
buffer layout

The bulk NMOS buffers used for comparison are
fabricated on the substrate of the same SOI wafer after etching
away the buried oxide. The structure is shown in Fig. 2. This
allows us to compare the ESD protection capability of SOI and
bulk-substrate output buffers with similar fabrication processes
and physical structures.

Fig. 2. Structure of a SOI NMOSFET and a bulk
NMOSFET being tested

The transistors are stressed according to the Human
Body Model (HBM) as specified in Mil-Std 883C Method
3015.7. Eight to ten transistors are stressed with 3 pulses per
stress level to obtain a statistical distribution of ESD failure
voltages at each stress condition. A maximum leakage current
of 100nA for drain voltage ranging from -0.5V to 3V is chosen
as the failure criterion.

Positive Polarity HBM Discharge Of NMOS
Output Buffers

During positive polarity HBM discharge, an NMOSFET
operating in the bipolar breakdown/snapback mode is usually
used as a clamping device [9,10]. Figs. 3 and 4 show the
breakdown/snapback characteristics of the bulk and SOI
NMOSFETs respectively.

Fig. 3. Breakdown/snapback characteristics of
bulk NMOSFETs with different effective
channel lengths

Fig. 4. Breakdown/snapback characteristics of
SOI NMOSFETs with different effective
channel lengths. Measurements are done
for both body floating and body grounded.
A double snapback breakdown behavior is observed in
the bulk NMOSFETs and body-grounded SOI NMOSFETs, and
this double snapback phenomenon has been claimed to give
reasonable ESD protection [10,11,12]. No snapback behavior
can be observed when the body of the SOI NMOSFET is left
floating because the breakdown voltage is lower than the
snapback voltage. The second snapback of SOI NMOSFETs
occurs  at  a  smaller  drain  current  with  lower  holding  voltage 
when  compared  with  the  bulk  case.  Thus,  SOI  NMOS  buffers 
are  more  efficient  in  clamping  ESD  discharge  voltages. 
However,  as  the  second  snapback  is  believed  to  be  caused  by 
current  localization  due  to  the  negative  resistance  coefficient  of 
silicon  beyond  a  critical  temperature  [13,14],  the  lower  second 
snapback  current  also  indicates  more  serious  Joule  heating 
taking place in the SOI NMOSFETs. This is reasonable since the
heat sink capability of silicon is much higher than the heat sink
capability of SiO2, which completely surrounds the active silicon
island in the SOI NMOS buffers. If we take the ratio of the
snapback voltage-current products of the bulk (Fig. 3) and SOI (Fig.
4) MOSFETs as the inverse ratio of the device thermal
resistances, the SOI device thermal resistance is about 2-4 times
that of the bulk device. The resistance in the second breakdown
regime (inverse of slope) is also larger for SOI devices (Fig. 4)
than for bulk devices (Fig. 3) due to the higher current density. This is
deleterious for ESD reliability.
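
The thermal resistance estimate above can be illustrated with a short calculation. The sketch below assumes that both devices reach the same critical temperature rise at the second snapback point, so that thermal resistance scales as 1/(V·I). The function name and the numerical snapback points are hypothetical and are not read from Figs. 3 and 4.

```python
def thermal_resistance_ratio(v_bulk, i_bulk, v_soi, i_soi):
    """Return R_th(SOI) / R_th(bulk).

    Assumption: both devices hit the same critical temperature rise at
    second snapback, so R_th is proportional to 1 / (V * I) and the
    ratio of resistances is the inverse ratio of the V*I products.
    """
    p_bulk = v_bulk * i_bulk  # power at second snapback, bulk device
    p_soi = v_soi * i_soi     # power at second snapback, SOI device
    if p_bulk <= 0 or p_soi <= 0:
        raise ValueError("snapback power must be positive")
    return p_bulk / p_soi


# Hypothetical snapback points in volts and amperes (illustration only).
ratio = thermal_resistance_ratio(v_bulk=6.0, i_bulk=0.5, v_soi=5.0, i_soi=0.2)
print(f"R_th(SOI) / R_th(bulk) = {ratio:.1f}")
```

With these illustrative numbers the ratio falls inside the 2-4 range quoted in the text; real values would have to be taken from the measured characteristics.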

In actual positive polarity HBM ESD discharge, the
average ESD voltage sustained by SOI NMOSFETs is about
580V for a 250µm wide device, only about half of that sustained by
bulk NMOSFETs, which average 1020V. The box plot in Fig.
5 shows the median, interquartile ranges, and the extremes of
HBM ESD voltage levels withstood. No significant difference is
observed between the body-floating and the body-grounded case,
in agreement with the similarity of holding voltages in Fig. 4.
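
To put these failure voltages in terms of discharge current, the following sketch uses the standard HBM network (a 100pF capacitor discharged through a 1.5kΩ resistor). It assumes the device resistance is negligible next to 1.5kΩ, so the peak current is roughly V/R. The function names are illustrative.

```python
HBM_CAPACITANCE_F = 100e-12   # HBM storage capacitor, 100 pF
HBM_RESISTANCE_OHM = 1500.0   # HBM series resistor, 1.5 kOhm


def hbm_peak_current(v_stress):
    """Approximate peak discharge current (A), neglecting device resistance."""
    return v_stress / HBM_RESISTANCE_OHM


def hbm_stored_energy(v_stress):
    """Energy (J) initially stored on the HBM capacitor."""
    return 0.5 * HBM_CAPACITANCE_F * v_stress ** 2


v_soi, v_bulk = 580.0, 1020.0  # average failure voltages quoted in the text
print(f"SOI : {hbm_peak_current(v_soi):.2f} A peak")
print(f"bulk: {hbm_peak_current(v_bulk):.2f} A peak")
print(f"SOI/bulk failure-voltage ratio: {v_soi / v_bulk:.2f}")
```

Most of the stored energy is dissipated in the 1.5kΩ resistor rather than in the device, so the peak current is the more relevant figure for comparing the two technologies.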


Fig.  5.  ESD  failure  voltage  of  bulk  and  SOI 
NMOSFETs  under  positive  ESD  HBM 
stress 


The failure mode for the SOI NMOSFETs was found to
be a short among the gate, drain and substrate, as can be
seen in Fig. 6. A similar failure mode is observed in the bulk
NMOSFET. The failures are believed to be caused by Joule
heating resulting in a thermal runaway condition during second
breakdown [15]. The temperature at the drain junction is high
enough to cause silicon melting and ejection through the thin
gate oxide [16], thus causing a short between the gate, drain and
substrate.


Fig. 6. Example of SOI NMOSFET ESD failure
under positive ESD HBM stress

It is interesting to observe that reducing the effective
channel length of the bulk NMOSFET increases the mean and
reduces the spread at the lower end of ESD failure voltages, but
it produces the opposite trend for SOI NMOSFETs. As the
amount of ESD stress sustained is believed to be determined
by the silicon temperature, the effect of increasing the gate
length for SOI NMOSFETs has, according to the observed ESD
test results, a stronger impact on increasing the heat sink
capability, which more than compensates for the adverse effect
of increasing snapback holding voltage and series resistance.


Negative Polarity HBM Discharge Of NMOS
Output  Buffers 

The  failure  voltages  of  SOI  and  bulk  NMOSFETs 
under  negative  HBM  ESD  stress  are  shown  in  Fig.  7.  In  bulk 
NMOSFETs, the negative polarity HBM discharge pulses are
absorbed by the large drain to substrate forward biased diode.
This allows the transistor to sustain a higher (about 300V in our
case) negative discharge voltage compared with the case of
positive polarity discharge. However, in SOI technologies,
large-area vertical PN junctions are not available and the
discharge current path is restricted to the thin active silicon film,
which is usually thinner than 150nm. Due to the reduced level
of bipolar action during negative discharge, the
negative HBM discharge current is clamped by the NMOSFET
operating in the so-called transistor-diode mode. Since the
series resistance of the NMOSFET in this operating mode is
relatively high, together with the high current density restricted
in the thin film, serious local heating results in a much lower
negative HBM discharge current or voltage.
Fig. 7. ESD failure voltage of bulk and SOI
NMOSFETs  under  negative  ESD  HBM 
stress 


Experimental results show that the average ESD
discharge voltage sustained by SOI NMOSFETs during negative
discharge is about 10% lower compared with the positive HBM
discharge case at L_eff = 0.5µm. The main failure mode is again a
short among the gate, drain and substrate for both the SOI and
bulk NMOSFETs. But in some SOI NMOSFETs, gate oxide
rupture is observed, which is caused by the potential difference
between drain and gate due to the high series resistance. Again
no observable difference can be found between the body-grounded
and body-floating case. As the effective channel length
increases, the negative ESD voltage sustained by the SOI
NMOSFET decreases and more gate oxide rupture is observed.
This is not surprising since the series resistance of the transistor
increases with channel length, resulting in more serious
heating and higher potential across the gate oxide.

From  the  above  results,  we  see  that  the  bi-directional 
ESD  HBM  discharge  susceptibility  is  limited  by  negative 
polarity  discharge.  Since  higher  negative  ESD  failure  voltages 
can  be  attained  by  reducing  the  channel  length,  the  gate  length 
should  be  kept  small  to  improve  the  overall  ESD  reliability  of 
SOI  NMOS  buffers. 


ESD  Performance  of  SOI  CMOS 
Output  Buffers 

In conventional bulk CMOS output buffers, positive
polarity ESD HBM discharge limits the ESD reliability. It
has been reported that the parasitic devices in a bulk CMOS
well process can play a significant role during an ESD event by
providing additional current paths for positive stress current
with respect to either V_DD or V_SS [5,17,18]. These do not apply
to SOI CMOS technologies because no well is available in the
thin film. Besides, the ESD HBM failure voltage of SOI
NMOSFETs is limited by negative polarity discharge as
indicated in the previous sections. Since breakdown/snapback
does not occur in the PMOSFET until very high voltages (15V in
our case), the addition of the PMOSFET cannot improve the
performance of the NMOSFET during negative ESD HBM
discharge. On the other hand, because the snapback voltage of the
NMOSFET is low, the PMOSFET cannot help in the positive
ESD HBM discharge either.

Fig.  8  shows  the  average  ESD  failure  voltages  of  SOI 
CMOS  buffers  with  and  without  p-channel  device  under  both 
positive  and  negative  HBM  stresses.  No  significant  difference 
can  be  observed  whether  the  PMOSFET  is  present  or  not.  This 
confirms  that  ESD  protection  in  SOI  CMOS  buffers  is  provided 
by the NMOSFET alone.


Fig. 8. Comparison of ESD performance of a
CMOS output buffer device with and
without the p-channel device. The
transistors under test have dimensions W/L
= 20µm/1µm for the NMOSFET and W/L
= 30µm/1µm for the PMOSFET. They all
have T_ox = 7nm, T_si = 150nm, T_box =
400nm. 6 transistors are stressed to
obtain the mean of the distribution.
Other Device Parameters Related to ESD
Protection  Capability 

Many researchers have shown that increasing the gate-to-contact
spacing in abrupt-junction NMOS/CMOS output
buffers can improve the ESD protection capability [2,5,17,19].
By varying the gate-to-contact spacing from 1µm to 5µm, the
average ESD failure voltage of bulk NMOSFETs under positive
HBM stress increases by about 200V as shown in Fig. 9 (a). But
it has no observable effect on the positive HBM ESD failure
voltages of SOI NMOSFETs.




This can be explained by the three dimensional nature of
bulk MOSFETs, in which case the substrate is capable of
sinking the heat generated along the current path from the
contact to the gate. However, due to the presence of an
insulating buried oxide, the heat generated in a SOI MOSFET
can only flow laterally, resulting in roughly the same
temperature at the drain/channel junction regardless of
gate-to-contact spacing. Thus a similar ESD failure voltage is
obtained.

Furthermore,  increasing  the  gate-to-contact  spacing 
results  in  a  higher  series  resistance  during  negative  discharge, 
which  causes  more  power  dissipation  in  the  transistor.  Thus,  a 
larger  gate-to-contact  spacing  even  lowers  the  negative  ESD 
failure  voltages  of  SOI  MOSFETs  as  shown  in  Fig.  9  (b). 
Therefore, due to bi-directional ESD stress considerations,
gate-to-contact spacing should be kept small.

Silicon film thickness is another important design
parameter in SOI circuits in determining the operating modes of
the transistors [20]. At the same power dissipated in the
transistor, the silicon temperature increases with decreasing
silicon film thickness because of the reduction of heat capacity
in smaller silicon volume [21]. More serious local heating may
also result because of the higher series resistance and higher
current density confined in the thin film. As a result, the ESD
performance will be worse as the silicon film thickness is scaled
down. Fig. 10 shows the average ESD failure voltage as a
function of silicon film thickness, which confirms the prediction.
Fig. 9. ESD failure voltage of NMOSFETs with
different gate-to-contact spacing under
(a) positive ESD HBM stress, and (b)
negative ESD HBM stress. The transistors
have L_eff = 7µm, W_eff = 250µm and T_ox =
8nm


Fig. 10. Mean and standard deviation of ESD
failure voltages of SOI NMOSFETs versus
different silicon film thickness under both
positive and negative ESD stress. The
transistors have L_eff = 7µm, W_eff = 20µm
and T_ox = 7nm. Silicon film thickness is
measured by the method described in [22]
By increasing the channel width of the MOSFET, ESD
performance  improves  accordingly.  However,  in  SOI  circuits, 
the  ESD  failure  voltage  does  not  increase  as  much  with 
increasing  device  width  compared  with  the  bulk  technologies, 
which  is  illustrated  in  Fig.  11.  This  prevents  the  attainment  of 
good  ESD  protection  by  simply  enlarging  the  device  width  as  is 
routinely  done  for  bulk  IC  ESD  protection. 


Fig. 11. ESD failure voltage versus device width for
NMOSFETs for SOI and bulk technologies. The
transistors have L_eff = 1µm and T_ox = 6nm. The
results are for Human Body Model Test under (a)
positive stress, and (b) negative stress.

In general, most of the strategies used for improving
ESD  performance  in  bulk  technologies  do  not  apply  to  SOI 
circuits.  ESD  protection  schemes  have  to  be  re-optimized  to 
provide  better  SOI  ESD  reliability. 


Conclusion 

In this paper, the ESD protection capability of non-optimized
submicron ultra-thin gate oxide SOI and bulk CMOS
buffers is studied and compared using Human Body Model
stresses. Results show that the ESD voltages sustained by SOI
NMOS/CMOS buffers are only about 55% of those achieved by
the bulk technology. This is mainly attributed to the poor heat
dissipation due to the insulating buried-oxide layer, causing
higher temperature in the silicon film during an ESD event.
Due to the absence of large vertical PN diodes, the ESD
bi-directional stress is limited by negative polarity stress pulses. To
obtain the maximum bi-directional HBM ESD protection level,
the channel length should be kept minimal. Our study also
shows that most of the methods developed in bulk technologies
to improve ESD performance do not work as well in SOI
circuits, thus different strategies should be investigated.

As an alternative ESD protection scheme, we propose
to design ESD protection circuits on the silicon substrate
through openings in the buried oxide, created by an extra
masking step. With the CMOS protection circuits, one may
choose to build both NMOSFET and PMOSFET or only the
NMOSFET in the substrate (while keeping the PMOSFET in the
Si film) in order to simplify the process. As the above results
show, this protection scheme is capable of improving the ESD
performance by 100% and allows most of the ESD protection
schemes developed for bulk technologies to be directly
transferred to the SOI technologies.
Abstract

This paper is concerned with three-dimensional scene reconstruction algorithms. The goal is to implement
a system in which the video sequence obtained from movement of a camera around a 3-D object can be used
to reconstruct arbitrary intermediate views not directly recorded by the camera. This problem has been
solved for the case in which the object is defined “synthetically” in the computer. The main difference
between our approach and the existing ones is that we use real objects and real video cameras, rather than
mathematically defined objects typically used in computer graphics applications. Our approach is to “scan”
a 3-D object by translating a camera across it, construct the resulting depth map, and reconstruct arbitrary
views based on computing the transformation between the camera locations. An interesting application of
this system occurs when the shape of a 3-D object needs to be transmitted to a remote location. The idea
here is to build a 3-D representation of the object from its recorded video signature at the transmitter. At
the receiver, the viewer’s head is tracked in such a way as to update the view on a stereoscopic display,
thus creating a 3-D impression of the object at the receiver end. We will show simulation results on 3-D
intermediate view reconstruction.


1  Introduction 

With the development of faster memory and graphics hardware, there has been increased interest in the
development of virtual environments. Researchers have been investigating the representation of mathematically
defined or “synthetic” objects. The result has led to “virtual” worlds which create a 3-D impression of
the object to the viewer as he/she moves around.

Our interest is to extend this idea to real 3-D objects. Many applications of such an environment
immediately come to mind: the designer who wants to show some prospective clients on the other coast her
design; the real-estate agent who displays houses by having interested parties “walk-through” a simulation;
and the surgeon who studies a 3-D simulation before attempting the real surgery. With special hardware
such as a stereoscopic display and a head tracking device, the viewer is able to gain the sense of 3-D where
the system updates the display according to the movement of the viewer’s head. Thus, the goal is to devise
a compact representation which sufficiently captures the 3-D information of the real object.

There are several possible approaches to this scene representation/reconstruction problem. Certainly,
the easiest approach would be to capture a very large number of views of a given object and store these
images off-line in a huge image database for later reconstruction. While this solution would provide high
quality reconstructed images, it is neither practical nor efficient, for it requires large amounts of memory
and it does not exploit the inherent 3-D geometry of the scene. Another possibility is to model the object by
mathematical formulae and store this reduced set of information. This approach saves in storage, but would
require a complex analysis of the object, and accurate reconstructions can be obtained only with
models having a large number of degrees of freedom. Instead, we desire an approach that would provide
high quality reconstructions and yet would not need a great deal of memory.

In this paper, we consider an approach where a camera has scanned a given stationary object along
several pre-specified trajectories. From each of these sequences, we recover depth information at certain
locations, and use this information, along with the corresponding intensities, to generate fairly accurate
reconstructions.

In Section 2, we describe the algorithms for deriving the compact representation and scene reconstruction.
Section 3 contains the experimental results on a particular object. Finally, we conclude in Section 4 with a
discussion of our approach.

2  Description  of  the  Scene  Reconstruction  Algorithms 

In  order  to  construct  a  system  which  enables  users  to  visualise  objects,  two  issues  should  be  addressed: 
the  representation  of  objects  and  their  reconstruction  from  an  arbitrary  view. 

2.1  Derivation  of  the  Compact  Representation 

To derive a compact representation of a 3-D object, we must first devise a method for acquiring the necessary
information. We propose to capture several video sequences by scanning a camera along a number of
trajectories with known geometries. An example of a rectangular scanning pattern with four linear trajectories
is shown in Figure 1. In this figure, the camera is assumed to translate across each trajectory A, B,
C, and D. We may also scan along another set of trajectories at a different elevation. By doing so, more
3-D information of the object is captured. Note that the second set of trajectories are not necessarily mere
translations along the y direction of the first set of trajectories; translation along the z-component may also
be present. Our goal is to extract sufficient yet compact information from these scanned frames so that we
may reconstruct an arbitrary view of the object anywhere along the trajectories.

Figure 1: An example of “scan” geometry along four linear trajectories A, B, C, and D.

One possible representation of the 3-D scene consists of the depth and intensity at selected frames of
each trajectory, e.g. at locations 1, 2, and 3 of trajectory A in Figure 1. These selected frames are referred
to as reference frames. We believe that from this set of data, we can reconstruct intermediate views of the
object at arbitrary points on the trajectories. Assuming that the reference frames from each trajectory have
already been selected, the steps for deriving the representation are as follows:

1. Derive an initial estimate of depth for each frame relative to the nearest reference frame. We first solve
the correspondence problem by matching features between each frame and the nearest reference frame.
There are several approaches to address this problem; for simplicity, we choose to perform a simple
block matching search for each feature. The depth is then simply inversely related to the disparity
between matched pixels, i.e. if $\Delta x$ is the disparity between a feature in the reference frame and the
current frame, then the depth for those points is approximately equal to $1/\Delta x$. Thus, for each frame,
we generate a frame of depths for every pixel, the so-called depth map.

2. Compute and equalize scaling factor between depth maps. In the previous step, we find the depths to
within a scale factor. It is quite possible that the scale factors among the depth maps all differ. Before
we may go on to the next step, we must first determine the factor of each depth map and then scale
them with respect to one depth map.


To determine the scale factor, we find the region of pixels with a particular minimum depth in one
depth map and then determine the depths of the corresponding points in a second depth map. We may
use the feature correspondences from the first step to aid in identifying the corresponding points. As
described in [1], we solve a linear regression problem where the depths of the pixels in the first depth
map $z_1$ are matched to those in a second depth map $z_2$. If $a$ is the scale factor, then the estimate for
$a$ is given by

$$\hat{a} = \frac{\sum_{i \in K} z_1(i)\, z_2(i)}{\sum_{i \in K} z_2(i)^2}$$

where the set $K$ consists of the pixels for which depth is defined in both $z_1$ and $z_2$. Once the factor
is determined for every depth map, then we scale each depth map by its factor so that they are all
equalised with respect to the same depth map.

3. Combine depth maps to obtain accurate depth information at each reference frame. The depth map
associated with each reference frame is determined by the neighboring equalised depth maps. For every
point, we examine the depth at the corresponding point in each of the neighbors, remove the outliers
which are greater than one standard deviation from the mean, and combine the rest by weighted sum
to generate a single value. We then have a collection of accurate depth maps for each reference frame.

4. Estimate camera motion between reference frames. We identify edges in the reference frames and use
them  to  determine  camera  motion.  For  details  of  the  approach,  see  [1,  3,  3]. 

The  compact  representation  of  the  object  then  consists  of  the  set  of  depth  maps,  intensities,  and  motion 
parameters  for  each  designated  reference  frame. 
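
Steps 1 to 3 above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the disparities are assumed to be already available from block matching, undefined depths are marked as NaN, the scale factor is taken as the least-squares solution of $z_1 \approx a z_2$, and all function names are hypothetical.

```python
import numpy as np


def depth_from_disparity(disparity, min_disparity=1e-6):
    """Step 1: relative depth ~ 1/|disparity|; NaN where disparity is ~0."""
    d = np.asarray(disparity, dtype=float)
    depth = np.full(d.shape, np.nan)
    valid = np.abs(d) > min_disparity
    depth[valid] = 1.0 / np.abs(d[valid])
    return depth


def scale_factor(z_ref, z_other):
    """Step 2: least-squares a minimizing sum over K of (z_ref - a*z_other)^2.

    K is the set of pixels where depth is defined (finite) in both maps.
    """
    k = np.isfinite(z_ref) & np.isfinite(z_other)
    denom = np.sum(z_other[k] ** 2)
    if denom == 0:
        raise ValueError("no overlapping pixels with nonzero depth")
    return np.sum(z_ref[k] * z_other[k]) / denom


def fuse_depth_maps(maps, weights=None):
    """Step 3: per pixel, drop samples more than one standard deviation
    from the mean and combine the rest by a weighted average."""
    stack = np.stack([np.asarray(m, dtype=float) for m in maps])
    n = stack.shape[0]
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    w = w.reshape((n,) + (1,) * (stack.ndim - 1)) * np.ones_like(stack)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = np.nanmean(stack, axis=0)
        std = np.nanstd(stack, axis=0)
        keep = np.isfinite(stack) & (np.abs(stack - mean) <= std + 1e-12)
        w = np.where(keep, w, 0.0)
        vals = np.where(keep, stack, 0.0)
        wsum = w.sum(axis=0)
        fused = (w * vals).sum(axis=0) / wsum
    return np.where(wsum > 0, fused, np.nan)


# Usage on toy data: equalize a second map to a reference, then fuse.
z1 = depth_from_disparity(np.array([[1.0, 0.5], [0.25, 0.0]]))
z2 = z1 / 2.0                      # same scene at a different scale
a = scale_factor(z1, z2)           # recovers the factor 2
fused = fuse_depth_maps([z1, a * z2])
```

Marking undefined depths as NaN keeps the set $K$ implicit; a separate validity mask would work equally well.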


2.2  Reconstruction  of  an  Intermediate  View 

Once we have generated the compact representation for a particular 3-D object, we may choose to reconstruct
an  intermediate  view  of  the  object  at  a  specified  point  on  one  of  the  trajectories.  Assuming  that  the  relative 
position  and  orientation  in  space  of  the  desired  intermediate  frame  are  known,  the  steps  for  reconstruction 
are  as  follows: 

1. Choose the appropriate reference frame(s) to use. From the relative position and orientation of the
desired frame, we should decide which reference frame(s) to use. One way of deciding is to use the
reference frames which have the smallest motion parameters relative to the intermediate frame.

Another consideration is the number of reference frames. If the intermediate frame is very close to
one of the reference frames in the database, then we may choose to use only that reference frame for
reconstruction, referred to as unilateral reconstruction. It is also possible that the particular view falls
along a linear path between two reference frames. In this case, using both reference frames in bilateral
reconstruction may be better. Finally, the view may lie within a region defined by four reference frames
as in the case of two pairs of reference frames at two different elevations; quadrilateral reconstruction
may be the best choice in this case.

2. Generate estimates of the intermediate view by applying motion parameters to each reference frame.
We are assuming implicitly that the relative motion from the intermediate view to each of the chosen
reference frames is known. The notion of applying motion parameters to a frame has been addressed
in conventional computer vision literature [4, 5]. If $(X_1, Y_1)$ are the initial image coordinates, $(X_2, Y_2)$
are the final coordinates after motion, and $z$ is the depth at the given point, then


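$$X_2 = \frac{(r_1 X_1 + r_2 Y_1 + r_3)\, z + \Delta x}{(r_7 X_1 + r_8 Y_1 + r_9)\, z + \Delta z}$$

$$Y_2 = \frac{(r_4 X_1 + r_5 Y_1 + r_6)\, z + \Delta y}{(r_7 X_1 + r_8 Y_1 + r_9)\, z + \Delta z}$$

where the parameters $r_1, r_2, \ldots, r_9$ are rotation parameters and $\Delta x$, $\Delta y$, $\Delta z$ are translation parameters
as described in [3].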


3.  Combine  the  estimates  of  the  previous  step  from  each  reference  frame  and  interpolate  the  data  to 
the  nearest  grid  points.  Once  we  compute  the  estimates  of  the  desired  view  with  respect  to  each  of 
the  chosen  reference  frames,  we  must  decide  how  to  combine  this  data  to  generate  the  appropriate 
reconstruction.  Furthermore,  it  is  quite  possible  after  the  previous  step  that  the  estimates  do  not 
coincide  with  the  sampling  grid,  so  we  must  also  ensure  that  the  data  is  interpolated  to  the  grid  points. 

We propose the following technique for interpolation. Suppose there are N reference frames for
reconstruction and suppose that there exists at least one data point near the given pixel (i,j). Then, the
intensity value of the pixel on the sampling grid is given by


where A_ij is a 2Δ x 2Δ region centered around pixel (i,j), Δ is the spacing on the sampling grid, d_kn
is the distance of the k_n-th point in frame n, with intensity I_kn, from the (i,j) pixel, and w_n is the weight
for the data from reference frame n. Generally, these weights depend on the location with respect to
the reference frames.

It is possible that there exist no points in a given 2Δ x 2Δ region, creating what we refer to as “holes.”
This condition occurs for de-occluded regions, that is, areas which are uncovered after the occluding
object moves. In this case, we grow the area A_ij out to a 2mΔ x 2mΔ region, where m is the smallest
value for which a point falls within the area A_ij, i.e. the region is no longer a hole. Once we find such
an area, we then use the above interpolation formula to find the intensity value at the grid point (i,j).
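
The interpolation and hole-filling steps can be sketched as follows. This is only an illustration: it assumes inverse-distance weighting of the samples inside the window, which is one common choice and may differ from the exact weighting used here. The function name `interpolate_pixel`, the `max_growth` limit, and the small constant `eps` are hypothetical.

```python
def interpolate_pixel(i, j, frames, weights, delta=1.0, max_growth=8, eps=1e-6):
    """Estimate the intensity at grid point (i, j).

    frames:  one list per reference frame of (x, y, intensity) samples,
             i.e. the off-grid estimates produced by the motion step.
    weights: one weight per reference frame.
    The window starts as 2*delta on a side and grows to 2*m*delta until it
    contains at least one sample (hole filling).
    """
    for m in range(1, max_growth + 1):
        half = m * delta
        num = 0.0
        den = 0.0
        for samples, w in zip(frames, weights):
            for x, y, intensity in samples:
                if abs(x - i) <= half and abs(y - j) <= half:
                    d = ((x - i) ** 2 + (y - j) ** 2) ** 0.5
                    k = w / (d + eps)  # inverse-distance weight (assumed form)
                    num += k * intensity
                    den += k
        if den > 0.0:
            return num / den
    return None  # still a hole after max_growth enlargements


# Two frames contribute one sample each, equally far from the grid point.
two_frames = [[(0.5, 0.0, 100.0)], [(-0.5, 0.0, 200.0)]]
blended = interpolate_pixel(0, 0, two_frames, [1.0, 1.0])

# A single distant sample: the window must grow to m = 3 before it is found.
sparse = [[(3.0, 0.0, 80.0)]]
filled = interpolate_pixel(0, 0, sparse, [1.0])
unfilled = interpolate_pixel(0, 0, sparse, [1.0], max_growth=2)
```

A depth-aware variant would multiply `k` by a factor that favours samples with smaller depth, in the spirit of the refinement described in the next paragraph.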

Another  refinement  which  improves  reconstruction  is  to  consider  depth  when  interpolating.  When  an 
object moves in a frame, we would like the pixels of the object to have more weight than those from the
background  occupying  the  same  region.  One  possible  solution  is  to  place  more  value  on  those  pixels 
with  smaller  depth  (closer)  and  less  on  those  with  larger  depth  (farther  away).  The  result  is  that  pixels 
with  more  motion  will  dominate  over  those  that  tend  to  be  stationary. 


3  Experimental  Results 


We  shall  now  examine  some  results  using  the  techniques  described  above.  To  generate  these  results, 
we  scan  an  orange-juice  container  along  four  linear  trajectories.  The  result  consists  of  a  40-frame  sequence 
per  trajectory.  A  typical  frame,  e.g.  frame  #8  from  view  2,  is  shown  in  Figure  2.a.  For  each  of  the  four 
sequences,  frames  8,  20,  and  32  have  been  chosen  to  be  the  reference  frames. 

Following the algorithm outlined in Section 2.1, we derive the depth maps for each of the twelve reference
frames. An example of the depth map corresponding to frame #8 is shown in Figure 2.b. The depth map
has been heavily quantized to enable visualization. As can be seen, the area corresponding to the container
is generally lighter than the background, indicating that it is closer to the camera than any other object
in the scene. The depth maps associated with the other reference frames are similar. It is interesting to
note that certain patches of the background appear to be light in color. This is because of the mismatches
associated with solving correspondence for areas of constant intensity.

Using the depths as well as the intensities of the reference frames, we reconstruct the views along the
four trajectories according to the algorithm in Section 2.2 for bilinear reconstruction. For analysis, the worst
reconstructions from two of the four trajectories are shown in Figure 3. Note that a measure of error is
not included since the original sequence might not contain the exact frames corresponding to the motion
parameters used to generate the intermediate view.

Figure 3.a is the intermediate view 40% and 60% of the translational motion between reference frames
8 and 20, respectively, for trajectory 2. We observe that the overall quality of the image is good, due to
the very accurate depth map for frame 8 shown in Figure 2.b. Some errors occur along the left edge of the
container. The reason is that the depth map corresponding to frame 8 gives points immediately to the left of
the box, which should be background, more depth than the background, causing the resulting interpolated
image to be erroneous. This can also be attributed to not de-emphasizing enough the intensities of points
from the stationary background compared with those in the moving container. In our algorithm, the depth
is used to determine which pixels to de-emphasize; it appears that the depth information may be too noisy
for this purpose.

The reconstruction in Figure 3.b shows the view 40% from frame 8 for trajectory 3. Most of the artifacts
occur in the left part of the stool and the upper-left portion of the orange juice container. We believe that
many spurious matches near these constant-intensity regions cause the depth maps to be inaccurate,
thus lowering the quality of the reconstructions.


4  Discussion 


We have proposed an approach for representing and reconstructing stationary 3-D objects. The
reconstructions in the last section seem to indicate that this approach is very promising. Many of the artifacts
occur at the boundaries of the objects in the scene, and they stem primarily from inaccurate depths at these
points. The block-matching technique we employ is simple yet reasonably adequate to solve the
correspondence problem; however, it is not optimal. Problems occur in the background when there is little
movement. Other techniques such as hierarchical searches and gradient methods may be able to improve
the results. Another interesting approach is to estimate motion and depth simultaneously.

Future work in this area includes examining the optimum number of reference frames to fully capture an
object. A more complete analysis must be performed in order to determine what the scope of a single
frame is, or conversely, what the optimum set of reference frames to compactly represent an object is.

In addition, a real-time implementation of the reconstruction algorithm would expedite the development of a
virtual environment. Using a stereoscopic display and head tracking device, we will be able to simulate such
a system by reconstructing an arbitrary view of an object in real time as the user moves his/her head. The
area of scene reconstruction and its application to virtual environments seems very fertile and this research
serves as a good starting point.


Acknowledgments 

This work was supported by NSF-PYI grant MIP-0057466, ONR Young Investigator award N00014-92-J-1732,
Joint Services Electronics Program (JSEP) contract F49620-90-0029, and Sun Microsystems.


References 

[1] A. Zakhor and F. Lari, “Edge-based 3-D camera motion estimation with application to video coding,”
in Motion Analysis and Image Sequence Processing (M. I. Sezan and R. L. Lagendijk, ed.), ch. 4, Kluwer
Academic Publishers, 1993.

[2] A. Zakhor and F. Lari, “3-D camera motion estimation with applications to video compression and
scene reconstruction,” in Proceedings of the SPIE: Image and Video Processing, vol. 1903, San Jose, CA,
3-4 Feb. 1993.

[3] R. Y. Tsai and T. S. Huang, “Uniqueness and estimation of three-dimensional motion parameters of
rigid objects with curved surfaces,” IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-6, no. 1,
pp. 13-26, Jan. 1984.

[4] B. K. P. Horn, Robot Vision. Cambridge, MA: MIT Press, 1991.

[5] J. Weng, N. Ahuja, T. S. Huang, “Motion and structure from point correspondences with error estimation:
Planar surfaces,” IEEE Trans. Sig. Proc., vol. 39, no. 12, pp. 2691-2717, Dec. 1991.

[6] P. Anandan, J. R. Bergen, K. J. Hanna, and R. Hingorani, “Hierarchical model-based motion estimation,”
in Motion Analysis and Image Sequence Processing (M. I. Sezan and R. L. Lagendijk, ed.), ch. 1, Kluwer
Academic Publishers, 1993.


Figure 3: Typical reconstructed frames along the trajectories. (a) 40% between 8 and 20 (view 2); (b) 40%
between 8 and 20 (view 3).
An IC Chip of Chua’s Circuit *


Jose M. Cruz and Leon O. Chua


Abstract: This paper reports a working microelectronic chip implementation
of Chua’s circuit. This chip has been designed and fabricated using
a 2 μm CMOS technology, with the circuit itself occupying a silicon area of
2.5 mm x 2.8 mm. The chip needs to be powered with a single 9-V battery, is
autonomous, and generates the three state variables of Chua’s circuit. The
proper operation of this chip has been confirmed by experimental reproduction
of bifurcation and chaotic phenomena. This microelectronic design
of Chua’s circuit can be employed as a basic component in the VLSI synthesis
of complex circuits making use of chaotic signals, including a class of
cellular neural networks and secure communication systems.

1  Introduction 

Electronic circuits exhibiting well-understood bifurcation and chaotic behavior can be
exploited as basic components of emerging classes of complex dynamic electronic networks
and systems, including cellular neural networks exhibiting spatially chaotic dynamics
and secure communication systems based on chaos synchronization.

Chua’s circuit [1]-[9] is the simplest autonomous circuit which can exhibit bifurcation
and chaos. It has been studied extensively and is one of the very few circuits in
which a formal proof of the existence of chaos has been accomplished [5]. Moreover, the
theoretical and simulated behavior of this circuit can be accurately reproduced experimentally.
These factors have made Chua’s circuit a tool for studying and generating
chaos, and it is being used as a building block for developing other more complex circuits
exploiting chaotic and bifurcation phenomena [10] [11].

Several physical implementations of the circuit have been proposed since 1985 [6] [7]
[8]. They use discrete components to implement the linear elements and a combination
of op amps, resistors, diodes, or discrete bipolar transistors to implement the nonlinear
element (Chua’s diode). Recently, monolithic CMOS implementations of the Chua’s
diode [2] and the Chua’s circuit [1] have been fabricated.

In this paper we report the experimental results and the implementation details of
a microelectronic chip implementing Chua’s circuit. The linear resistor R is the only
element implemented externally, by a potentiometer, to allow the setting of a bifurcation
parameter. This chip has been designed and fabricated using a 2 μm double-metal
double-poly CMOS technology [15]. The three linear storage elements are implemented
with double-poly capacitors, one of them used to emulate the inductor [13]. The resonant
frequency of the active LC circuit is approximately 160 kHz. This 8-pin, autonomous
chip is powered by a single 9-V bias battery, and generates three output
signals representing the state variables of Chua’s circuit.

The outline of this paper is as follows. Section II gives the chip electrical specifications
and its experimental performance. It shows the three projections of the experimental
double-scroll Chua’s attractor and the experimental bifurcation sequences
obtained by modifying two independent parameters. Section III gives the internal structure
of the chip and details the design procedure for a CMOS technology. Section IV
presents a numerical simulation of the chip. Finally, Section V gives some concluding
remarks and applications of the new chip.

2  Chip  Specifications  and  Experimental  Performance 

2.1  Parameters  of  Chua’s  circuit  chip 

Chua’s circuit, shown in Figure 1, is a third-order circuit. The three variables are the
voltages across capacitors C_1 and C_2 and the current through L. They are denoted as
v_C1, v_C2 and i_L, respectively; and their dynamics are given by:

$$C_1 \frac{dv_{C_1}}{dt} = \frac{1}{R}\,(v_{C_2} - v_{C_1}) - g(v_{C_1})$$

$$C_2 \frac{dv_{C_2}}{dt} = \frac{1}{R}\,(v_{C_1} - v_{C_2}) + i_L \qquad (1)$$

$$L \frac{di_L}{dt} = -v_{C_2}$$

where g(v_C1) is the function given in Figure 1(b). Inside the range (-E_2, E_2) of v_C1, in
which the circuit normally operates, this function is given by

$$g(v_{C_1}) = m_1 v_{C_1} + \frac{1}{2}\,(m_2 - m_1)\left(|v_{C_1} + E_1| - |v_{C_1} - E_1|\right) \qquad (2)$$
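
To make the state equations concrete, the sketch below evaluates the right-hand side of Chua’s equations with the nominal chip values quoted in this paper and advances the state with a fixed-step Runge-Kutta integrator. It is a minimal illustration, not the device-level simulation reported later. It assumes that m_1 = -0.41 mA/V is the slope outside the first breakpoint and m_2 = -0.78 mA/V is the slope around the origin, and it ignores the outer segments of the diode characteristic; the function names and the step size are hypothetical.

```python
# Nominal chip values quoted in the text, in SI units.
C1, C2 = 150e-12, 2000e-12      # farads
L, R = 0.33e-3, 1750.0          # henries, ohms
M1, M2 = -0.41e-3, -0.78e-3     # diode slopes in siemens (assumed roles)
E1 = 0.7                        # first breakpoint in volts


def g(v):
    """Chua's diode current for |v| below the second breakpoint."""
    return M1 * v + 0.5 * (M2 - M1) * (abs(v + E1) - abs(v - E1))


def chua_rhs(state):
    """Time derivatives of (v_C1, v_C2, i_L)."""
    v1, v2, il = state
    dv1 = ((v2 - v1) / R - g(v1)) / C1
    dv2 = ((v1 - v2) / R + il) / C2
    dil = -v2 / L
    return (dv1, dv2, dil)


def rk4_step(state, dt):
    """Advance the state by one classical Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = chua_rhs(state)
    k2 = chua_rhs(shifted(state, k1, dt / 2))
    k3 = chua_rhs(shifted(state, k2, dt / 2))
    k4 = chua_rhs(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


# The origin is an equilibrium; a small offset in v_C1 grows because the
# magnitude of the inner slope exceeds 1/R.
at_origin = chua_rhs((0.0, 0.0, 0.0))
after_one_step = rk4_step((0.01, 0.0, 0.0), 1e-8)
```

Iterating `rk4_step` from a small initial offset traces a trajectory of the idealized model; whether that trajectory is a double scroll for these exact values is not claimed here.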
A particular Chua’s circuit is characterized by seven parameters denoted as (C_1,
C_2, L, R, E_1, m_1, m_2). They represent, respectively, the values of the two linear
capacitors, C_1 and C_2, the linear inductor, L, the linear resistance, R, and finally the
first breakpoint, E_1, and the inner slopes, m_1 and m_2, of the Chua’s diode driving-point
characteristic.

The IC chip reported in this paper implements a Chua’s circuit with the seven
parameter values given in Table I. Five of these parameters have fixed values (C_1,
C_2, E_1, m_1 and m_2), and the two others (L and R) are variable. This allows us to
implement with our chip a two-dimensional parameter space of possible Chua’s circuits.


Table I. Parameters of the chip

Parameter   Value    Unit
C_1         150      pF
C_2         2000     pF
L           0.33     mH
R           1750     Ω
m_1         -0.41    mA/V
m_2         -0.78    mA/V
E_1         0.7      V

It is shown in [5] how, by proper normalization of the three state variables, v_C1, v_C2
and i_L, and of the time scale, the set of Chua’s circuits with different dynamics can be
specified with only four parameters (α, β, a, b), instead of seven. However, the control
of the R and L values still gives us access to a two-dimensional parameter space of Chua’s
circuits. In particular, varying R leads to the variation of a combination of β, a and b,
while varying L leads to the independent variation of β.
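
The effect of R and L on the normalized parameters can be checked numerically. The sketch below assumes the commonly used normalization α = C_2/C_1, β = C_2 R^2/L, with a and b equal to R times the diode slope around the origin and the slope outside the first breakpoint, respectively. These definitions are taken as an assumption here, since this paper defers them to [5], and the function name is hypothetical.

```python
def dimensionless_parameters(c1, c2, l, r, m_inner, m_outer):
    """Return (alpha, beta, a, b) under the assumed normalization."""
    alpha = c2 / c1
    beta = c2 * r * r / l
    a = r * m_inner
    b = r * m_outer
    return alpha, beta, a, b


# Nominal chip values from the parameter table (SI units).
nominal = dimensionless_parameters(150e-12, 2000e-12, 0.33e-3, 1750.0,
                                   -0.78e-3, -0.41e-3)

# Changing only L moves beta and leaves alpha, a and b untouched.
larger_l = dimensionless_parameters(150e-12, 2000e-12, 0.40e-3, 1750.0,
                                    -0.78e-3, -0.41e-3)

# Changing only R moves beta, a and b together, but not alpha.
smaller_r = dimensionless_parameters(150e-12, 2000e-12, 0.33e-3, 1600.0,
                                     -0.78e-3, -0.41e-3)
```

Under these assumed definitions the nominal point is roughly α = 13.3, β = 18.6, a = -1.37 and b = -0.72.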

2.2  External  description  of  the  chip 

Figure 2 shows a photograph of the IC chip of Chua’s circuit. The package is an 8-pin
DIP, 0.3 inches wide and 0.1 inches interlead. It can be plugged into standard
breadboards or op amp sockets. The output pins of the chip are defined in Table II.
Table II. Output pins of the chip

Pin no.   Name        Description
1         v_1         output signal v_C1
2         v_2         output signal v_C2
3         v_3         output signal r_d i_L
4         v_+         positive terminal for bias
5         v_-         negative terminal for bias
6         control_d   terminal for L tuning (optional use)
7         control_c   terminal for L tuning (optional use)
8         v_GND       output ground reference

The chip is autonomous, and therefore does not require any input signal. The bias
is provided by a single 9-V battery connected between terminals v_+ and v_-. To set the
two independent bifurcation parameters we use potentiometers R and R_β connected as
shown in Figure 3.^1

The chip generates three output signals representing the three state variables of
Chua’s circuit. For convenience, the three signals are provided as three voltages, v_1, v_2
and v_3, referenced to a common ground, v_GND. The outputs v_1 and v_2 are the voltages
across capacitors C_1 and C_2 in volts. The output v_3 is a voltage proportional to the
current through the inductor^2, according to

$$v_3 = r_d\, i_L \qquad (3)$$

The nominal value of the proportionality constant, r_d, is -500 Ω.

2.3  Experimental  results 

Using  the  above  IC  chip  of  Chua’s  circuit,  powered  by  a  9V-battery,  we  have  generated 
chaos  and  bifurcation  phenomena. 


^1 The potentiometer R_β is used only as a convenient means of changing the bifurcation parameter β
[5] in the experimental results presented in this paper. However, this potentiometer R_β is not necessary
for the operation of the chip. An alternative way to change β, if desired, is by applying directly a voltage
bias reference to pin control_d, which controls the value of the internal voltage-controlled inductor. This
latter approach, of using electronic control of the internal L, may be more convenient for those using
this chip as part of a larger electronic system.

^2 In previous experimental implementations of Chua’s circuit this third signal, representing the
current through the inductor, has been generated by measuring the voltage drop in a small resistor
connected in series with the inductor, an approach which may distort the dynamic behavior of the
circuit. In our implementation, as we will demonstrate in Section III, the signal v_3 is generated
without introducing any artifact affecting the dynamics of Chua’s circuit.
2.3.1 Chaos generator

Using the nominal values of R = 1750 Ω and R_β = 20 kΩ the chip works as a generator
of chaotic signals. The chip operates in the double-scroll region. Figure 4 shows the
experimental time waveforms of the three output state variables v_1, v_2 and v_3. Figure
5 shows the three experimental Lissajous figures. They represent the projection of the
chaotic strange attractor onto the (v_1, v_2), (v_1, v_3), and (v_2, v_3) planes.

Figure 6(a) shows a photograph with a magnified detail of the central part of the
(v_1, v_2) projection of the double-scroll Chua’s attractor. It corresponds to the same
conditions as the lower left photograph of the previous figure. Figure 6(b) shows a
further magnification of the region. It is possible to distinguish some of the individual
trajectories that thickly fill the outer surface of the attractor during the 1/8 s of the
time-exposure photograph. In the center of this photograph, a perspective of the
trajectories going along the inner part of the spiral cylinder can also be observed.

Note that for the same parameters there is another possible solution, that is, a large
limit cycle associated with the outer slopes of the nonlinear element. However, this
solution can only be observed if the initial conditions for the state variables are forced
to be far away from the origin, outside the basin of attraction of the double-scroll
Chua’s attractor. Normally, when the chip is powered, the state variables are inside
the basin of attraction of the chaotic attractor, because, due to current leakage, the
storage capacitors are initially discharged. Figure 7 shows a photograph superimposing
the two solutions.

2.3.2  R  Bifurcation  sequence 

The well-known bifurcation sequence obtained by decreasing the resistor value R has
been experimentally reproduced. As an example, Figure 8 shows the experimental
Lissajous figures in the (v_1, v_3) plane. R_β is kept constant at the nominal value of
20 kΩ. As R is decreased from 2050 Ω to 1568 Ω we observe periodic behavior emerging
from a stable equilibrium point, then a period-doubling sequence, a spiral Chua’s
attractor and finally a double-scroll Chua’s attractor. As R is decreased further the
double-scroll Chua’s attractor shrinks in size and its central region becomes thinner.
As R decreases below 1568 Ω, the double-scroll Chua’s attractor and the saddle-type
periodic orbit collide with each other. At that point the only solution is the large limit
cycle determined by the outer segments of the Chua’s diode characteristic.




2.3.3 β Bifurcation sequence


This chip also allows independent change of the bifurcation parameter β [5]. Using
the configuration of Figure 3, the parameter β can be increased by decreasing the value
of the potentiometer R_β. As an example, Figure 9 shows the experimental Lissajous
figures in the (v_1, v_3) plane. R is kept constant at the nominal value of 1750 Ω. As R_β is
decreased from 26 kΩ to 18 kΩ we observe, as before, periodic behavior emerging from
a stable equilibrium point, then a period-doubling sequence, a spiral Chua’s attractor
and finally a double-scroll Chua’s attractor. Observe that now the chaotic attractor
does not decrease in size as we change the bifurcation parameter.


3  Internal  structure  of  the  chip 

The chip described in the previous section has been implemented using a CMOS process.
In this section, we present the internal structure of the chip, and describe the
design procedure used. This design procedure can be used for the VLSI implementation
of Chua’s circuit in a different technology or with different parameters. This section
is structured as follows. In the first part we give the internal structure of the chip at
the network element level. Then, we detail the design of each of these elements at the
transistor level for a CMOS technology. Finally, we present the physical structure of
the entire chip.


3.1  Internal  Network  level  structure 

Figure 10 gives the network schematic of the Chua’s circuit implemented in the chip.
It contains the nonlinear 2-terminal Chua’s diode N_R, three capacitors C_1, C_2 and C_3,
and a gyrator G with admittance matrix given by

$$Y_G = \begin{bmatrix} 0 & g_d \\ -g_c & 0 \end{bmatrix} \qquad (4)$$

The gyrator terminated at its right port by capacitor C_3 looks like an inductor of value

$$L = \frac{1}{g_c\, g_d}\, C_3 \qquad (5)$$

at its left port.

This circuit has 3 nodes. The voltages at these nodes, denoted by v_1, v_2 and v_3,
are available at the chip output. In spite of the fact that we have introduced an extra
node, with respect to the circuit of Figure 1, we have not introduced any other state
variable into the circuit. The two new circuit variables introduced in the new circuit,
v_3 and i_C3, have values determined by

$$v_3 = -\frac{1}{g_d}\, i_L, \qquad i_{C_3} = g_c\, v_2$$


Therefore, the circuits in Figure 1 and 10 are equivalent. The latter is more suitable
for VLSI implementation. Besides, the availability of the third state variable as
a voltage allows us to experimentally measure this variable without introducing any
measuring circuitry that could modify the dynamics of Chua’s circuit.
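
As a quick numerical check of the gyrator relations, the sketch below computes the emulated inductance from the capacitor C_3 = 269 pF and the nominal transconductances of OTA C and OTA D quoted elsewhere in this paper (0.41 mA/V and 2.00 mA/V). It also derives the proportionality constant between v_3 and i_L, assuming it equals -1/g_d. The variable and function names are illustrative only.

```python
def emulated_inductance(c3, g_c, g_d):
    """Inductance seen at the left port of a gyrator loaded by c3."""
    return c3 / (g_c * g_d)


C3 = 269e-12       # farads
G_C = 0.41e-3      # siemens, nominal transconductance of OTA C
G_D = 2.00e-3      # siemens, nominal transconductance of OTA D

L_EMULATED = emulated_inductance(C3, G_C, G_D)   # about 0.33 mH
R_D = -1.0 / G_D                                 # about -500 ohms (assumed relation)
```

Both results agree with the nominal inductance of 0.33 mH and the nominal value of -500 Ω stated for the proportionality constant.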

The nonlinear resistor and the gyrator are active network elements and therefore
need to be biased. We use a bias scheme in which only one external battery is used.
Figure 11 shows the entire circuit including the bias. The floating external battery
is connected to terminals v_+ (pin 4) and v_- (pin 5). These terminals are internally
connected to the positive and negative supply of the Chua’s diode, of the gyrator, and
of the bias circuit shown at the right of the figure. This bias circuit generates, at its
low resistance output, a signal v_GND (pin 8) with a voltage value in the middle of the
values at the v_+ and v_- nodes. This voltage v_GND is considered to be our ground.

3.2  CMOS  implementation 

Figure 12 shows the architecture of the entire circuit using CMOS amplifiers and capacitors.
The two operational transconductance amplifiers A and B, in positive feedback
configuration, implement the Chua’s diode [2]; the two back-to-back transconductance
amplifiers C and D implement the gyrator [14]; and the serial set of CMOS diodes and
the operational amplifier in negative feedback configuration implement the bias circuit.

For the implementation we have used a 2 μm double-metal double-poly CMOS technology.
The most relevant parameters of this technology are given in Table III. The
capacitors have been implemented directly by using the two poly layers as capacitor
plates (capacitance per unit area is 470 pF/mm^2 in our technology). The voltage-mode
operational amplifier has been designed using the two-stage Miller-compensated
topology [12]. The four operational transconductance amplifiers have been designed
using a topology based on simple differential pairs, as this gives the maximum effective
frequency response and minimal input noise.
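
The quoted capacitance density makes it easy to estimate how much silicon the three double-poly capacitors need. The following short calculation is only a rough estimate that ignores routing, guard rings and any spacing between plates; the variable names are illustrative.

```python
CAP_DENSITY_PF_PER_MM2 = 470.0   # double-poly capacitance per unit area

# Capacitor values stated in the paper, in picofarads.
capacitors_pf = {"C1": 150.0, "C2": 2000.0, "C3": 269.0}

# Plate area needed for each capacitor, in square millimetres.
areas_mm2 = {
    name: value / CAP_DENSITY_PF_PER_MM2
    for name, value in capacitors_pf.items()
}
total_area_mm2 = sum(areas_mm2.values())
```

Under these figures C_2 alone needs a little over 4 mm^2 and the three capacitors together about 5.1 mm^2, which is most of the roughly 7 mm^2 occupied by the circuit.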
Table III. Technological Data

Parameter   N-channel   P-channel   Unit
V_th        1.0         0.8         V
μC_ox       47          23          μA/V^2
γ           1.06        0.45
ΔL          0.54        0.42        μm
ΔW          0.07        0.17        μm

Figure 13 shows the transistor schematic topology used for the OTAs. This topology
is the same for all the OTAs, but each is designed to obtain different transconductance
characteristics. The transconductance gain of each of them, at the origin, is denoted
as g_a, g_b, g_c and g_d, respectively. Table IV gives their nominal values.

The transconductance amplifiers A and B implement the nonlinear Chua’s diode.
They determine the parameters m_1, m_2 and E_1. The transconductances g_a and g_b of
OTA A and OTA B should be equal to m_1 - m_2 and -m_1, respectively. The output
current of OTA A is limited to a constant value when v_1 reaches the breakpoint E_1 = 0.7
V. In this chip the OTAs A and B have fixed characteristics, which are set by the self-bias
circuit at the left of Figure 13.^3
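
The way two transconductors produce the piecewise-linear characteristic can be illustrated with an idealized model. The sketch below treats OTA B as a linear transconductor and OTA A as a transconductor whose output current saturates abruptly at the first breakpoint; the hard clipping, the sign convention and the function names are assumptions of this example, and a real differential pair saturates more smoothly. The values 0.37 mA/V and 0.41 mA/V are the nominal transconductances listed for OTA A and OTA B.

```python
G_A = 0.37e-3   # siemens, nominal transconductance of OTA A
G_B = 0.41e-3   # siemens, nominal transconductance of OTA B
E1 = 0.7        # volts, first breakpoint


def ota_a_current(v):
    """Idealized OTA A: linear up to the breakpoint, then constant."""
    limit = G_A * E1
    return max(-limit, min(limit, G_A * v))


def diode_from_otas(v):
    """Current into the diode when both OTAs act as negative conductances."""
    return -G_B * v - ota_a_current(v)


def piecewise_g(v, m1=-G_B, m2=-(G_A + G_B)):
    """Three-segment characteristic with outer slope m1 and inner slope m2."""
    return m1 * v + 0.5 * (m2 - m1) * (abs(v + E1) - abs(v - E1))


test_voltages = [-2.0, -0.7, -0.3, 0.0, 0.5, 0.7, 1.0, 2.0]
mismatch = max(abs(diode_from_otas(v) - piecewise_g(v)) for v in test_voltages)
```

With g_b = -m_1 and g_a = m_1 - m_2 the two descriptions coincide, so `mismatch` is zero up to rounding error. The saturation current of OTA A in this model is g_a E_1, about 0.26 mA.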

The necessary nonlinearity of the circuit is produced by the cut-off of just one of the
transistors of the differential pair of OTA A: T101 (for v_1 < -E_1) or T102 (for v_1 > E_1).
As transitions from cut-off to conduction can cause small delays, we want to prevent
any other transistors in the signal path from cutting off. We achieve this by shorting
the drains of T101 or T102 of OTA A with the equivalent transistors of OTA B. This
connection (which does not appear in Figure 12 to avoid clutter) is equivalent to the
parallel connection of the current mirrors of both OTAs.^4 The current bias of OTA
B will always maintain all current mirrors in conduction. Using this scheme we can
obtain a driving-point characteristic for the Chua’s diode which does not show any
measurable hysteresis phenomena at the frequency of operation centered in the 160
kHz range. The design procedure to determine all the transistor dimensions of these
two amplifiers can be found in [2]. For our particular bias levels the final dimension
values are given in Table V.

The transconductance amplifiers C and D implement the gyrator. They determine
the parameter L according to equation (5). The capacitor C_3 has a fixed value of
269 pF. The transconductances g_c and g_d are controllable in order to get a variable
inductor. Their nominal values, given in Table IV, are obtained when the control line
of OTA C (pin 7 of the chip) is left open, and the control line of OTA D (pin 6 of the
chip) is connected to R_β = 20 kΩ as indicated in Figure 3. Under this condition the
gyration ratio is 1.23 x 10^6 Ω^2 and the emulated inductor has a value of 0.33 mH. This value
can be changed in a ±50% range by R_β adjustment. All transistor dimensions of these
amplifiers are shown in Table V.

^3 They can be adjusted if an external pin is assigned to the control line of OTA A. We have recently
used that scheme in a Chua’s diode chip prototype, and we have successfully used it to experimentally
reproduce bifurcation phenomena by continuously varying the slope m_2 (bifurcation parameter a).

^4 Equivalently, it is also possible to merge all the current mirrors of OTA A with those of OTA B,
increasing accordingly the width of their transistors.

The LC circuit formed by C_2 and the emulated inductor has a quality factor of
Q = 78.5, which is actually higher than what is usually obtained by using discrete
components^5. This has been achieved by designing gyrator amplifiers with very large
ratios between their transconductances and their output conductances.


Table IV. Transconductance Values

OTA   Transconductance   Unit
A     0.37               mA/V
B     0.41               mA/V
C     0.41               mA/V
D     2.00               mA/V

^5 A typical series resistance of 18 Ω in the inductor will degrade the quality factor to approximately
Q = 25.


Table V. Mask dimensions of the internal transistors of the OTAs

Transistors                                      L (μm)
T102A, T102B             40      15      15      6
T103A, T105A, T106A      280     400     400     4
T103B, T105B, T106B      280     400     400     2
T104A                    280     400     3200    4
T104B                    280     400     3200    2
T107A                    140     200     200     4
T107B                    140     200     200     2
T108A                    138     200     1600    4
T108B                    138     200     1600    2
T109A                    100     476     476     6
T301A                    50      50      50      10
T301B                    100     100     100     10
T302A, T302B             33      33      33      10

A conventional operational amplifier and a set of diodes are used to implement
the bias circuit. The diodes are made by gate-to-drain connected transistors. They are
sized so that their midpoint voltage (v_GND) is just in the middle of the values at the
v_+ and v_- nodes. The accuracy of this voltage division is not critical, as the circuit is,
to first order, insensitive to variations in the DC voltage difference between v_GND and
the supply nodes. AC variations are minimized to less than 6 mV by the use of a high
gain conventional operational amplifier with negative feedback.

3.3  Physical  structure 

The circuit has been fabricated in the 2 µm CMOS technology of ORBIT Semiconductors
[15]. Figure 14 shows a micrograph of the fabricated circuit. It occupies a silicon area
of  2Jymm  x  2.Himn.  All  the  active  circuitry  is  in  the  central  part  of  the  upper  side  of  the
die.  The  three  rectangular  blocks  at  the  bottom,  from  left  to  right,  are  respectively 
capacitors C1, C2 and C3.

4  Numerical  simulations 

The experimental results are validated with numerical simulations. As an example,
Figure 15 shows a device-level numerical simulation of the chip in chaotic operation
with the same conditions used to obtain the experimental results of Figure 5. The
correspondence with the experimental data shown earlier is excellent.

5  Concluding  Remarks 

In this paper we have presented a working microelectronic Chua’s circuit which produces
chaotic signals whose experimental dynamics are in close concordance with
theoretical and numerical predictions based on the ideal Chua’s equation [5]. The circuit
occupies an area of 7 square millimeters in 2 µm CMOS technology. In a large die
it is possible to place 57 of our circuits. The number of possible circuits on a chip can
be increased to about 600 by applying the scaling rules given in Appendix I.

Our major motivation for this work was the need for microelectronic circuits that
could be used as building blocks of several classes of systems that require the use of
chaotic behavior. Some examples are secure communication systems based on chaos
synchronization [10], and network arrays [11] [16] with spatially chaotic dynamics. The
successful implementation of these systems relies upon the availability of a chaotic
electronic component exhibiting experimental dynamics that closely resemble a mathematical
model and that can be accurately controlled. We hope that our design will
facilitate the VLSI implementation of these emerging classes of circuits and systems
making use of chaotic phenomena.
Appendix I: Scaling rules for high-density VLSI implementation of Chua’s circuits.

With our present design it is possible to design chips containing up to 57 circuits
(assuming a large die size of 20 mm x 20 mm). Those interested in higher integration
densities can scale down the capacitor values and the current levels. The simplest scaling
scenario is a linear scaling, in which the new values of capacitors and conductances
are kC1, kC2, kC3, km-i, kniz and where k is the scaling parameter. The scaling
of the capacitors is done simply by reducing the area. The scaling of the conductances
is done by proportionally reducing the width of the transistors of the input stage of
OTAs A and B. The gyrator design should be unchanged.
After this scaling the circuit will operate at the same frequency and we will
get the same voltage levels for all variables. All the currents, however, will be scaled
by the same factor k. As both the capacitors and the currents are scaled
by the same factor, the GBW of all transconductance amplifiers does not degrade to
first order, and remains above the operating frequency of the circuit.
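
As a rough numerical check of this argument, the following Python sketch scales one capacitor and one transconductance by the same factor k and confirms that the ratio gm/C, which sets the time scale, does not change. The capacitor value is a placeholder chosen only for illustration (it is not given in the text); the transconductance is the Table IV value for OTA A.

```python
import math

def scale_linear(capacitance_f, gm_s, k):
    """Scale a capacitor and a transconductance by the same factor k."""
    return k * capacitance_f, k * gm_s

def rate_per_second(capacitance_f, gm_s):
    """Characteristic rate gm/C that sets the circuit time scale."""
    return gm_s / capacitance_f

c0 = 100e-12    # hypothetical capacitor value (assumption, not from the paper)
gm0 = 0.37e-3   # OTA A transconductance from Table IV, in A/V
c1, gm1 = scale_linear(c0, gm0, k=0.02)

# Both rates agree, so the operating frequency is preserved by the scaling.
print(rate_per_second(c0, gm0), rate_per_second(c1, gm1))
```

Only the absolute current and capacitance levels shrink with k; every ratio that determines the dynamics is left untouched.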

The area estimate for the entire microelectronic Chua’s circuit in square millimeters
is equal to 0.5 + 6.5k, where 0.5 mm^2 is the area of the nonscaling elements and
6.5 mm^2 is the original area of the scalable elements. The lowest value of k is determined
by several factors, including noise degradation, parasitic capacitances that affect the
dynamics, and degraded amplifier phase margins and linearity ranges. If we consider a
realistic lower limit of k = 0.02, this gives 634 Chua's circuits per chip. Higher densities
can be achieved by using more advanced technologies.
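
The circuit counts quoted in this appendix can be reproduced with a few lines of Python. This is a minimal sketch that assumes the whole 20 mm x 20 mm die is available to the circuits (no pads, routing or scribe lanes); the function names are illustrative.

```python
import math

def circuit_area_mm2(k, fixed_mm2=0.5, scalable_mm2=6.5):
    """Area estimate 0.5 + 6.5*k (in mm^2) for scaling parameter k."""
    return fixed_mm2 + scalable_mm2 * k

def circuits_per_die(k, die_side_mm=20.0):
    """Whole circuits that fit on a square die, ignoring pads and routing."""
    return math.floor(die_side_mm ** 2 / circuit_area_mm2(k))

print(circuits_per_die(1.0))    # unscaled design: 57
print(circuits_per_die(0.02))   # k = 0.02: 634
```

Because the 0.5 mm^2 of nonscaling elements does not shrink, the count saturates at 800 circuits as k approaches zero under these assumptions.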
Figure Captions


Figure  1.  (a)  Chua’s  Circuit;  (b)  Driving-point  characteristic  of  Chua’s  Diode. 

Figure  2.  Photograph  of  the  IC  Chip. 

Figure  3.  A  typical  operating  configuration  of  the  chip,  with  the  bias  battery,  and 
two  external  potentiometers  to  set  two  bifurcation  parameters. 

Figure 4. Experimental time-domain waveforms of the state variables. (a) v1 vs.
time; (b) v2 vs. time; (c) v3 vs. time. (Vertical scales are 1 V/div.; horizontal scale is
50 µs/div.)

Figure 5. Experimental Lissajous figures of the double-scroll Chua’s attractor. All
scales are 500 mV/div.

Figure 6. Detail around the origin of the projection of the double-scroll Chua’s
attractor onto the (v1, v3) plane.

a) Vertical and horizontal scale is 200 mV/div.;

b) Vertical and horizontal scale is 100 mV/div.

Figure 7. Experimental limit cycle outside the double-scroll Chua’s attractor. Projection
into the (v1, v3) plane. (Vertical scale is 2 V/div.; horizontal scale is 2 V/div.)

Figure 8. R-bifurcation sequence. Experimental v1 vs v3 Lissajous figures. (Vertical
scale is 1 V/div.; horizontal scale is 500 mV/div.)

(a) R = 2050 Ω, period one;

(b) R = 2015 Ω, period two;

(c) R = 2009 Ω, period n;

(d) R = 1974 Ω, spiral Chua’s attractor;

(e) R = 1887 Ω, double-scroll Chua’s attractor after birth;

(f) R = 1568 Ω, double-scroll Chua’s attractor before dying.

Figure 9. β-bifurcation sequence. Experimental v2 vs v1 Lissajous figures. (Vertical
scale is 500 mV/div.; horizontal scale is 1 V/div.)

(a) Rβ = 26.0 kΩ, period one;

(b) Rβ = 25.0 kΩ, period two;

(c) Rβ = 24.6 kΩ, period four;

(d) Rβ = 24.0 kΩ, spiral Chua’s attractor;

(e) Rβ = 22.0 kΩ, double-scroll Chua’s attractor after birth;

(f) Rβ = IS.O kΩ, double-scroll Chua’s attractor before dying.

Figure  10.  Network  schematic  of  the  circuit. 

Figure  11.  Network  schematic  of  the  circuit  including  bias. 

Figure  12.  Architecture  of  the  chip. 

Figure 13. Transistor schematic of the OTAs.

Figure  14.  Micrograph  of  the  fabricated  circuit. 

Figure 15. Lissajous figures of the double-scroll Chua’s attractor obtained by electrical
simulation. All scales are 500 mV/div.
ABSTRACT

Bulk CMOS technology scaling cannot sustain the historical rate of speed increase. Realistic
targets for SOI delay and power reductions are 40% and 30%, independent of scaling, mostly
through capacitance reduction. Denser isolation allows more compact layout and easy
integration of different high speed (E/D NMOS), low power (CMOS) and analog (bipolar,
grounded-body CMOS) devices. The silicon device speed record (13 ps at 1.5 V, 300 K) has been set
with SOI E/D NMOS. Leakage current due to steady state and transient floating-body induced
threshold lowering (FITL) is a difficult device issue.


The Trend of Bulk Silicon Technology Scaling

The importance of electronics in the economic,
social and even political development throughout the
world will all but guarantee continued rises in circuit
integration density and speed. It is less clear if bulk
silicon technology can meet the historical trend of
speed improvement. A recent study suggests that the
speed trend cannot be sustained [1]. Idsat per unit
channel width ceases to increase with technology
scaling beyond the 0.5 µm technology. Even when we
examine the high-speed scenario, where Vcc reduction
is delayed as much as reliability considerations might
allow [1], Idsat still ceases to increase. The unpleasant
consequence for circuit speed is shown in (Fig. 1).
Instead of the historical speed doubling every two
generations, designers will need to work with speed
doubling every four generations.


Fig. 1. Bulk CMOS scaling cannot sustain the rate of speed
doubling every 2 generations beyond 0.35 µm technology [1].


Capacitance  Reduction  with  SOI 

The most often cited advantage of SOI technology
is higher speed due to reduction of junction
capacitance because of the buried oxide. Comparison
of bulk and SOI circuit power consumption provides
the most direct data [2]. The ratio of power,
P = CV^2f, is equal to the ratio of circuit
capacitance. Both data and calculations shown in
(Fig. 2) suggest that SOI circuits have approximately
two thirds the capacitance of bulk circuits.
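
The link between power and capacitance can be made concrete with a short Python sketch. It assumes the usual dynamic-power expression P = C·V^2·f, with both technologies compared at the same supply voltage and clock frequency; the voltage and frequency below are arbitrary placeholders, since they cancel in the ratio.

```python
import math

def dynamic_power(c_farads, v_volts, f_hz):
    """Switching power of a capacitive load: P = C * V^2 * f."""
    return c_farads * v_volts ** 2 * f_hz

def soi_to_bulk_power_ratio(c_soi, c_bulk, v_volts=1.5, f_hz=100e6):
    """Power ratio at equal voltage and frequency (placeholder values)."""
    return dynamic_power(c_soi, v_volts, f_hz) / dynamic_power(c_bulk, v_volts, f_hz)

# A capacitance ratio of two thirds gives the same power ratio.
ratio = soi_to_bulk_power_ratio(c_soi=2.0e-12, c_bulk=3.0e-12)
print(ratio)
```

This is why a measured power ratio at fixed voltage and frequency can be read directly as a capacitance ratio.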


                          C (SOI)     C (bulk)    C(SOI)/C(bulk)
Active gate (F.O. = 1)    14.1 fF     37.6 fF     0.37
N+ junction               9S fF       16.9 fF     0.60
P+ junction               7.6 fF      21.6 fF     0.35
Polysilicon               0.43 fF     0.98 fF     0.44
1st aluminum (1 mm)       71.6 fF     123.3 fF    0.59
2nd aluminum (1 mm)       63.9 fF     98.4 fF     0.65

Fig. 2. Comparison of circuit power consumption has
confirmed that typical circuit capacitance is reduced to 2/3 of
the bulk circuit. Buried oxide is SOOnm thick [2].

We  expect  this  capacitance  advantage  to  remain 
relatively  constant  independent  of  scaling.  Buried 
oxide  needs  to  be  “electrically”  thicker  than  or 
physically  as  thick  as  the  depletion  region  under  the 
source/drain. 


Subthreshold Current and Floating-body Induced
Threshold Lowering (FITL)

There are three components of MOSFET leakage
current [1]. One is bulk leakage, often referred to as
punchthrough. SOI eliminates this leakage path easily.
The second component is a surface leakage
component known as drain-induced barrier lowering,
VT lowering, or short channel effect. The third is the
subthreshold current, Ids(Vgs = 0) ∝ 10^(-VT/S).
SOI offers an opportunity to
bring S close to the limit of 2.3kT/q, or 60 mV/decade,
through the use of fully-depleted thin-film SOI [3].
Unfortunately, as Vd increases, drain-body junction
leakage and gate-induced drain leakage [4] cause
holes (in the case of NMOSFET) to flow into the
floating body. This raises the body potential and
hence lowers VT and increases the leakage (Fig. 3)
independent of channel length. This can happen even
when the silicon film is fully depleted. This floating-body
induced leakage is a very serious and difficult
problem, especially when one considers transient VT
drift.
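
To give a feel for the numbers, the Python sketch below evaluates the ideal room-temperature subthreshold swing and the leakage increase caused by a given threshold drop. It assumes the standard relations S = ln(10)·kT/q and off-state leakage proportional to 10^(-VT/S); the 0.15 V threshold drop is used only as an illustrative input.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23
ELECTRON_CHARGE_C = 1.602176634e-19

def ideal_swing_v_per_decade(temp_k=300.0):
    """Ideal subthreshold swing S = ln(10) * kT/q, in volts per decade."""
    return math.log(10.0) * BOLTZMANN_J_PER_K * temp_k / ELECTRON_CHARGE_C

def leakage_multiplier(vt_drop_v, swing_v_per_decade):
    """Factor by which off-state leakage grows when VT drops by vt_drop_v."""
    return 10.0 ** (vt_drop_v / swing_v_per_decade)

print(ideal_swing_v_per_decade())        # close to 0.060 V/decade at 300 K
print(leakage_multiplier(0.15, 0.060))   # 2.5 decades, a few hundred times
```

Even with an ideal swing, a threshold drop of this size raises the off-state current by more than two orders of magnitude, which is why floating-body effects matter so much for leakage.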


[Figure 3: horizontal axis is effective channel length (µm), 0 to 1.]

Fig. 3. Floating-body induced threshold lowering (FITL)
lowers VT and raises subthreshold leakage at high Vd, even
in long-channel devices. Gate oxide is 42nm [5].

There are several potential solutions: raise VT
to  allow  a  margin,  provide  a  contact  to  the  body,  or 
make  body/source  “leaky”.  We  believe  there  is  an 
important  device  design  concept  —  use  light  body 
doping  so  that  there  is  minimal  potential  variation 
across  the  silicon  film  thickness.  This  “uniform 
barrier”  design  will  minimize  the  barrier  against  hole 
flow  into  the  source  for  a  given  barrier  against 
electron  flow  into  the  channel  (the  subthreshold 
current). 

Enhanced  MOSFET  Current? 

Although reports on SOI devices typically show
lower Idsat than bulk devices of the same oxide and
channel dimensions, SOI MOSFETs can potentially
produce larger Idsat than bulk devices, as shown in
(Fig. 4) [6].
Fig. 4. SOI MOSFETs can provide larger current than bulk
devices due to reduction in VT and bulk charge if S/D
resistance is not excessive. The 5.5 nm gate SOI PMOSFETs
produced the highest transconductance ever reported [6].

The most important reason for SOI’s larger Idsat in
future low-voltage operation is the possibility of lower VT
[1], if floating-body induced VT lowering can be
controlled. Otherwise, FITL still leads to an effective
reduction in VT of about 0.15 V at high Vd and
enhanced Idsat in steady state [6]. Finally, reduced
bulk charge effect [7] can increase Idsat by around
10% [6] if the buried oxide is effectively much thicker
than the bulk depletion region thickness. On the other
hand, self heating and thin SOI’s higher S/D resistance
reduce Idsat.

Overall, we expect about the same Idsat for SOI and
bulk MOSFETs, with about 10% advantage toward
SOI, especially at very low supply voltage, with thicker buried
oxide and salicided S/D.

Reliability  and  Technology  Issues 

In  spite  of  high  dislocation  density  and  metal 
impurity  concentrations,  SIMOX  as  well  as  bonded 
SOI materials appear to be capable of producing bulk
quality  gate  oxide  [8].  Hot  carrier  reliability  is 
compromised  due  to  charge  trapping  in  the  buried 
oxide. However, effective graded LDD can be
produced  without  trade-off  with  junction  depth. 
Adequate  hot  electron  reliability  is  predicted. 

More than speed, leakage and reliability issues,
manufacturability will likely decide SOI’s future. In
this  respect,  SOI  has  several  advantages  in  isolations, 
latch-up,  shallow  junction,  contact  formation,  layout 
density,  etc.  It  is  worth  noting  that  there  are  novel  and 
intriguing  SOI  material  and  device  ideas.  One 
example  would  produce  dense,  vertical  double-gated 
thin  SOI  devices  using  a  bulk  silicon  substrate  as  the 
starting  material  [9]. 

Conclusion  and  Discussion 

A critical review suggests that bulk technology
scaling cannot sustain the historical rate of speed
increase. SOI reduces circuit capacitance by 30%, and
can potentially increase MOSFET current by 10%
through reduction in VT and bulk charge. Ease of
device isolation allows SOI technology to integrate
CMOS, complementary BJT, E/D NMOS [10], high
voltage devices and analog MOSFETs with body
contacts. The fastest silicon circuit delay record has been
set with SOI E/D NMOS (13 ps at 1.5 V and 300 K)
[11] (Fig. 5). The highest PMOSFET transconductance
record has been set with SOI technology (Fig. 4).
Manufacturability advantages may favor SOI as the
main-stream technology beyond the 0.15 µm
technology. Moderately thin (limited by S/D
resistance) fully depleted SOI on moderately thick
(limited by self heating) buried oxide is the most
attractive arrangement. “Uniform barrier” design is
proposed to minimize floating-body induced
threshold lowering (FITL).


Fig. 5. The fastest silicon transistor speed record, 13 ps, has
been set with E/D NMOS on SOI technology. 101 inverter
ring oscillator, Vdd = 1.5 V, Tox = 7 nm, Tsi = 50 nm,
Leff = 0.15 µm, 300 K [11].
A systematic study of the processing procedures required for minimizing structural
defects generated during the ion beam synthesis (IBS) of SiGe alloy layers has been
performed. The synthesis of 200 nm thick SiGe alloy layers by implantation of Ge ions
with an incident energy of 120 keV into <100> oriented Si wafers yielded various Ge
peak concentrations after the following doses: 2x10^16 cm^-2, 3x10^16 cm^-2 and 5x10^16 cm^-2.
Following implantation, SPE annealing in a nitrogen ambient at 800°C for 1 hour resulted
in only slight redistribution of the implanted Ge. Two kinds of extended defects were
observed in alloy layers synthesized at doses over 3x10^16 cm^-2 at room temperature: end-of-range
(EOR) dislocation loops and strain-induced stacking faults. The density of EOR
dislocation loops was much lower in those alloys produced by liquid nitrogen temperature
(LNT) implantation than by room temperature (RT) implantation. Decreasing the
implantation dose to obtain 5 at% peak Ge concentration prevents strain relaxation,
while those SPE layers with more than 7 at% Ge peak show high densities of misfit-induced
stacking faults. Sequential implantation of C following high dose (5x10^16/cm^2)
Ge implantation (12 at% Ge peak concentration in the layer) brought about a remarkable
decrease in the density of misfit-induced defects (stacking faults). When the nominal peak
concentration of implanted C was greater than 0.55 at%, stacking fault generation in the
epitaxial layer was considerably suppressed. This effect is attributed to strain
compensation by C atoms in the SiGe lattice. A SiGe alloy layer with 0.9 at% C peak
concentration under a 12 at% Ge peak exhibited the best microstructure. The
experimental results, combined with a simple model calculation, indicate that the optimum
Ge/C ratio for strain compensation is between 11 and 22. The interface between the
amorphous and regrown phases (a/c interface) showed a dramatic morphology change
during its migration to the surface. The initial <100> planar interface decomposes into a
<111> faceted interface, changing the growth kinetics. These phenomena are associated
with strain relaxation by stacking fault formation on (111) planes in the a/c interface.
ABSTRACT

In previous work, we and others have shown that bandpass
filtering of temporal trajectories of simple functions of
the critical band spectrum can lead to more robust speech
recognizers in the presence of additive and convolutional error.
In this study we report results on several mechanisms
for incorporating this analysis technique into training, in
a way that is consistent with on-line approaches to speech
recognition. In particular, we show improved robustness
to these forms of degradation for a system that maps the
filtered spectral points using a linear regression computed
from results of the different transformations.

1.  INTRODUCTION 

It has been demonstrated [1][2] that bandpass filtering
of temporal trajectories of the critical-band
spectrum (when it has been processed by a nonlinear
transformation) is efficient in alleviating some
harmful effects of both additive and convolutional
noise. While the technique appeared to be effective,
it raised two new problems:

1. The optimal form of the nonlinearity is dependent
on the noise level. Thus, the noise power
needs to be estimated for the analysis.

2. Since, depending on the estimated noise level,
a different compressive nonlinearity may be
applied in the analysis, the result is dependent
on the noise level. In a sense this is a trade of a
deterministic source of variance (the different
nonlinearities used) for a stochastic one (the
actual additive or convolutive noise).

Previous work [1] simply estimated the noise
power from the non-speech part of the signal to
address the first problem. The second problem
was addressed by using multiple templates derived
from the clean speech using a range of nonlinearities
corresponding to the range of expected noise
levels.


In the current work we estimate noise without
requiring explicit speech detection. Further, we
investigate three different techniques for compensating
for the effect of the variable nonlinearity.
The RASTA models derived for recognition need
to match the models derived during training. This
was always true for the early forms of RASTA in
which the nonlinearity was fixed (a logarithm), but
is nontrivial for a nonlinearity whose value is dependent
on an adaptively determined parameter
(noise level).

2.  BACKGROUND 

The basic idea of RASTA processing is to filter the
temporal trajectories of speech parameters (e.g.,
critical band values) after they have been transformed
by a static nonlinearity that (ideally) converts
the major sources of environmental interference
into an additive component. Over the last
year we have been experimenting with a parameterized
family of functions

y_i = \log(1 + J x_i)    (1)

where  i  is  the  critical  band  number. 
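
To illustrate equation (1), the following Python sketch evaluates the nonlinearity for one critical-band value and compares it with its two limiting forms. The function name and the numeric values of J and x are illustrative assumptions, not taken from the experiments.

```python
import math

def j_rasta_compress(x, j):
    """Equation (1): y = log(1 + J*x) for one critical-band value x >= 0."""
    return math.log1p(j * x)

j = 1e-6  # illustrative J value (assumption)

# When J*x is large the function approaches log(J*x), i.e. logarithmic behaviour.
print(j_rasta_compress(1e12, j), math.log(j * 1e12))

# When J*x is small the function approaches J*x, i.e. linear behaviour.
print(j_rasta_compress(1e3, j), j * 1e3)
```

The single parameter J therefore moves the operating point continuously between a linear and a logarithmic domain.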

For large values of J x_i, this function is close
to logarithmic, while for small values it is close
to linear. Experiments reported in [1][2] showed
that the optimal value for J is dependent on the
instantaneous noise power. To estimate this noise
power, we use an approach developed by Hirsch [3]
which uses the position of the principal mode of
the histogram of energy in each frequency band as
the noise power estimate for the band. The sub-band
estimates are currently combined for a robust
estimate of the total noise power. This noise
estimation technique does not require any speech
pause detection.
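
A minimal sketch of the histogram-mode idea for a single frequency band is given below. The use of energies in dB, the 2 dB bin width and the function name are assumptions made for illustration; the original estimator [3] may differ in these details, and the combination across sub-bands is not shown.

```python
import math
from collections import Counter

def noise_estimate_db(band_energies_db, bin_width_db=2.0):
    """Return the centre of the most populated histogram bin (principal mode)."""
    bins = Counter(math.floor(e / bin_width_db) for e in band_energies_db)
    mode_bin, _ = max(bins.items(), key=lambda item: item[1])
    return (mode_bin + 0.5) * bin_width_db

# Synthetic band energies: most frames sit near a 30 dB noise floor,
# while a minority of speech frames reach 55 to 62 dB.
frames = [30.2, 29.8, 30.1, 30.4, 29.9, 30.0] * 10 + [55.0, 60.0, 62.0, 58.0] * 5
print(noise_estimate_db(frames))
```

Because the mode is taken over all frames, no speech/non-speech decision is needed as long as noise-dominated frames are the most common energy level in the band.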


Though the overall processing has been shown
to provide some robustness, a drawback remains:
the choice of different J values, as required by differing
noise conditions, generates different spectral
shapes and dynamics of the spectra. This means
that the training system must contend with a new
source of variability due to the change in processing
strategy that is adaptively determined from
the data. The rest of this paper is concerned with
the solution of this difficulty.

3.  APPROACHES  TO  HANDLING  J 
VARIABILITY 

We have been working on three approaches to handling
this variability:

1. Multiple recognizers - several systems can be
trained using a different J value for each one.
Although clean speech is used for each training,
the differing J factors provide a range to
include the nonlinear function for cases that
will be encountered. In the recognition phase,
noise estimation is used to select a J value,
and the corresponding recognizer is used. As
will be shown, this works well, but several recognizers
must be trained.

2. Multiple J values for one recognizer - given
enough degrees of freedom in the trained system,
training data can be processed for training
with a range of plausible values for J.
This only requires training a single system,
but since this technique effectively increases
the size of the training set, it requires more
computing and possibly also more parameters
in the classifier to account for the added variability.

3. Spectral mapping - the noise-level dependent
choice of J introduces a deterministic source of
variability into the analysis, which one should
in principle be able to compensate for. To
date, however, we have not determined
a satisfactory analytic solution to this problem,
and therefore we have decided to apply
an empirically derived linear mapping which
would transform the spectrum obtained from
a J value corresponding to noisy speech to a
spectrum processed with a J value for clean
speech. In other words, we find a mapping between
\log(1 + J x) and \log(1 + J_ref x). For
this approach, we have used a linear regression
within each critical band. In principle,
this solution reduces the variability due to the
choice of J, and so minimizes the effect on the
training process.

In  the  next  section  we  describe  experiments  to 
test  these  three  methods. 

4.  EXPERIMENTS  AND  RESULTS 

We tested our approaches with a standard HMM
recognizer which was built with the HMM Toolkit
(HTK) [4]. The recognizer used 10-state word-based
HMMs, with 8 emitting states and output
probability distributions based on N-Gaussian diagonal
covariance matrices. The variances were
tied across all HMM states of all models (grand
variances). The speech was processed using a
25 ms Hamming window, and then parameterized
into 9 PLP-cepstral values. The test database consisted
of 13 isolated digits spoken by 200 speakers
over dialed-up telephone lines. All words were
hand end-pointed. To get enough training data to
model the HMMs we divided the set of 200 speakers
into 150 speakers for training and 50 speakers
for testing. A jackknife procedure was used so that
all speakers’ data could be tested on, resulting in
4 iterations (no overlap of testing). To balance
for the number of parameters, we used 4 mixtures
per state for all cases but that of 4 recognizers;
for this case we used a single mixture per state (a
greater number of mixtures did not substantially
change performance in an earlier pilot
experiment). To simulate additive noise we synthetically
added car noise to the clean (> 20 dB
SNR) speech to yield a 10 dB SNR level. Convolutional
noise was introduced by filtering the speech
with a linear filter simulating the spectral ratio
between an electret and carbon microphone. The
recognition results are presented in Table 1. The
first row gives the results when the environments
for the train and test phases are identical, and is in
some sense a best case scenario for non-RASTA
processing; often the testing condition is not available
during training. In all other rows the training
conditions were always “clean”, i.e., the additive
and convolutional errors were only applied to test
data. The second and third rows show the results
obtained with PLP and log RASTA processing.
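
The jackknife partition described above (200 speakers, 150 for training and 50 for testing, 4 iterations with non-overlapping test sets) can be sketched in Python as follows. The contiguous assignment of speakers to test blocks is an assumption; the text does not say how speakers were allotted to the four iterations.

```python
def jackknife_splits(speakers, n_folds=4):
    """Yield (train, test) lists; each speaker is tested exactly once."""
    fold_size = len(speakers) // n_folds
    for i in range(n_folds):
        test = speakers[i * fold_size:(i + 1) * fold_size]
        train = speakers[:i * fold_size] + speakers[(i + 1) * fold_size:]
        yield train, test

speakers = list(range(200))  # placeholder speaker identifiers
for train, test in jackknife_splits(speakers):
    print(len(train), len(test))  # 150 training and 50 test speakers per iteration
```

Pooling the four test results then gives a score over all 200 speakers without any speaker appearing in both the training and test data of the same iteration.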


Note that log RASTA (called RASTA here) reduces
the error rate for the filtered case but is not
effective for additive noise. In this task, RASTA
also appears to slightly improve the discriminability
between the word classes in the clean case, as in
fact one-third of the errors were eliminated with a
log RASTA front end (with respect to a PLP front
end).

The results using multiple recognizers are shown
in the fourth row (J-RASTA-mult). This appears
to work reasonably well in comparison with PLP
or log RASTA, but there is still a noticeable degradation.
In addition, there is a significant performance
loss for the clean data.

The next row (J-RASTA-uni) uses one recognizer
with data processed using different values of
J. This is an HMM version of our multi-template
approach [1] and appears to work better than the
multiple recognizer technique, both for clean and
noisy cases. This case only requires a single recognition
step, and so is a fairly straightforward way
of incorporating J-RASTA into a recognition system.
However, it does still require training with
multiple processings of the training data, which
increases training time.

The final row shows the results from the linear
mapping of filtered critical band values. In this
case, J-RASTA-filtered critical band outputs from
10 speakers* are used to train linear regression
models. We have used 2 coefficients for each of
15 critical bands. Thus, we map the J-RASTA-filtered
values for small J (high noise) to the corresponding
values for a larger J (low noise). In
particular,  for  each  of  3  different  values  of  J  (lO"’^, 
10“®,  and  10“®),  we  compute  a  mapping 

W_{ij} = C_1 + C_2 Y_{ij}    (2)

where Y_{ij} is the J-RASTA-filtered output for
critical band i, and the coefficients are determined
to minimize the mean-squared error between W_{ij}
and Y_{ij}^{ref}.
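
The per-band fit behind equation (2) is an ordinary least-squares line. The pure-Python sketch below fits the two coefficients for each band from paired samples and applies them; the function names and the toy data are illustrative assumptions, and the real system would use J-RASTA-filtered outputs for each of the 15 critical bands.

```python
import math

def fit_line(inputs, targets):
    """Least-squares C1, C2 minimizing the sum of (C1 + C2*x - t)^2."""
    n = len(inputs)
    mean_x = sum(inputs) / n
    mean_t = sum(targets) / n
    sxx = sum((x - mean_x) ** 2 for x in inputs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(inputs, targets))
    c2 = sxt / sxx
    c1 = mean_t - c2 * mean_x
    return c1, c2

def fit_band_mappings(noisy_bands, reference_bands):
    """Fit one (C1, C2) pair per critical band."""
    return [fit_line(y, ref) for y, ref in zip(noisy_bands, reference_bands)]

def apply_mapping(value, coeffs):
    """Equation (2): W = C1 + C2 * Y."""
    c1, c2 = coeffs
    return c1 + c2 * value

# Toy data for two bands (assumption): outputs for the noisy-J analysis and
# the corresponding outputs for the reference-J analysis.
noisy = [[0.0, 1.0, 2.0, 3.0], [1.0, 2.0, 4.0, 8.0]]
reference = [[1.0, 3.0, 5.0, 7.0], [0.5, 1.0, 2.0, 4.0]]
mappings = fit_band_mappings(noisy, reference)
print(mappings)
```

One such set of coefficients would be stored for each J value other than the reference, so that at recognition time only a table lookup and one multiply-add per band are needed.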

The recognizer was trained with clean speech
processed with J = 10“®, and during recognition
the optimal value of J was determined by a
local estimate of noise level for the isolated digit.
Then the J-RASTA-filtered critical band outputs

*  Nine  of  these  speakers  were  independent  of  the  test  set;  the 
tenth  was  one  of  the  200  speakers  in  the  final  testing. 
were linearly mapped using the regression coefficients
computed earlier. The performance of this
method appears to be quite good. In particular,
the score for the clean case is essentially the same
as for RASTA (in this case actually better than
for PLP), while the scores of the mapping approach
for the degraded cases are better than for most other
approaches (roughly equivalent to the J-RASTA-uni
approach). Unlike the other approaches, recognition
and training are both the same as for log
RASTA, as only a simple deterministic mapping
is required in the front end.

5.  DISCUSSION 

The techniques described here permit incorporation
of J-RASTA processing in an HMM-based
recognizer, at least for a small vocabulary isolated
word recognition task. However, the first two of
the three increase training time. The third (linear
mapping) approach appears preferable from
the data shown here (although the train-with-all
J-RASTA-uni approach gives slightly higher performance
for one condition). However, we do not
have enough experience with this method to know
whether the mapping is task or data-dependent.

While these techniques do provide significant robustness
to additive and convolutional noise, it
is clear that, in comparison to the performance
on clean speech, a significant increase in
error remains. Aside from the smoothing
they provide for fast non-speech events, RASTA
techniques only handle the constant (or slowly-varying)
components of non-linguistic variation.

We close with some caveats about the use of
RASTA. In the 2 years since we first reported some
RASTA results on recognition, many sites have
experimented with related features. Due to the
many different conditions under which these tests
were done, results varied from wonderful success
to dismal failure with many cases falling in between.
Fortunately, this variance does not appear
to have a random cause; we have learned a few
things about the use of RASTA in recognition of
speech. Some of these points are:

•  RASTA increases the dependence of the data on its previous
context. Therefore, simple context-independent subword-unit
recognizers can be degraded by RASTA. We have seen that RASTA has
worked well in tasks with whole word models (such as the one reported
here), or in phoneme-based recognizers that used triphones or broad
temporal input context (the latter being used for our neural-network
recognizers).

•  Log RASTA does not address the problem of additive noise. J-RASTA
in one of the forms described here appears to be able to handle both
additive and convolutional noise reasonably well.

•  Some RASTA users have had difficulty with initial conditions. One
needs to be aware that RASTA incorporates a filter with a significant
memory, and thus is different from the well-established short-term
spectral analysis of speech in which each analysis frame is entirely
independent of its surroundings. To illustrate this point, we
originally had difficulty in the experiments reported here when some
test files started with a non-audio artifact. The artifact itself was
cut off prior to pattern matching, but its effect spread well into
the useful part of the speech data due to the RASTA processing,
degrading the performance.
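The memory effect in the last point can be made concrete with a minimal sketch. It assumes the commonly cited RASTA band-pass filter, written here in causal form as 0.1(2 + z^-1 - z^-3 - 2z^-4) / (1 - 0.98 z^-1); the pole value 0.98, the zero initial state, and the function name `rasta_filter` are assumptions made for illustration, not details taken from this section.

```python
def rasta_filter(x, pole=0.98):
    """Filter one band trajectory x (a list of per-frame values).

    Assumed transfer function: 0.1*(2 + z^-1 - z^-3 - 2 z^-4) / (1 - pole*z^-1).
    """
    b = [0.2, 0.1, 0.0, -0.1, -0.2]  # FIR (numerator) coefficients
    y = []
    prev = 0.0  # zero initial state (an assumption)
    for n in range(len(x)):
        fir = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        prev = fir + pole * prev  # single-pole recursion carries the memory
        y.append(prev)
    return y


# A one-frame artifact at the very start of a file, followed by silence.
artifact = [1.0] + [0.0] * 99
response = rasta_filter(artifact)
```

After the first five frames the response decays only by the factor `pole` per frame, so a single bad frame still contributes dozens of frames later. This is why trimming the artifact after filtering does not remove its effect.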
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Abstract 

We model the texture distortion at a point in any particular
direction on the image plane as an affine transformation and derive
the relationship between the parameters of the affine transformation
and the surface shape and orientation. We use a technique for
estimating affine transforms between nearby image patches which is
based on solving a system of linear constraints derived from a
differential analysis. It is not necessary to explicitly identify
texels or make restrictive assumptions about the nature of the image
texture, such as isotropy. We have developed two different algorithms
for recovering surface orientation and shape based on the estimated
affine transforms in a number of different directions. The first is a
simple linear algorithm based on singular value decomposition. The
second is based on nonlinear minimization of a least squares error
criterion. Experimental results are presented on images of planar and
curved surfaces under perspective projection.

1  Introduction 

Traditionally, researchers have formalized the notion of texture
gradient as that of finding the gradient of certain scalar valued
functions such as foreshortening, area, density, compression or
scaling. The mathematical relationship between these gradients and
scene geometry has been developed both for planar surfaces [16] and
curved surfaces [8]. The main difficulty with the use of texture
gradients is that it has proven difficult to develop algorithms for
estimating the individual texture gradients that do not rely either
on explicit texel identification, e.g. Blostein and Ahuja [3], or on
restrictive assumptions about the nature of the surface texture such
as isotropy. Furthermore, Gårding has shown that the simple
distortion gradients do not contain enough information for
measurement of complete local surface curvature, e.g. the sign of the
Gaussian curvature.

Another major family of approaches to shape from texture in the
computer vision literature is based on making some a priori
assumption about the texture. The assumption most often used is that
of isotropy or weak isotropy of the texture [18, 5, 4, 2]. Under
projection, the texture will not generally appear isotropic, and thus
these methods use the deviation from isotropy in the projection to
infer 3D shape and orientation. There are two major weaknesses of
such an approach: (1) It cannot deal with directional texture such as
grass, fabrics, etc. (2) It makes only partial use of available
information, e.g. it does not exploit the change in size of the
projected texture.

The other assumption which has been used in the literature is that
of homogeneity, i.e. that the texture pattern has constant area or
density [9, 1, 10, 17, 14]. This is a more reasonable assumption for
natural textures, and our first criticism doesn't apply. However,
this assumption is too weak: it fails to exploit the systematic
change in shape of the texture elements.

Obviously,  some  assumption  about  the  texture  is 
necessary,  otherwise  what  we  are  seeing  could  in  fact 
just  be  a  particular  pattern  of  reflectance  changes  on 
a  flat  surface  (as  in  a  Renaissance  painting).  We  will 
assume  that  the  texture  is  the  same  at  different  points 
on  the  surface  in  the  scene. 

We believe that the natural way to model the texture distortion
locally is as a 2-D affine transformation between neighboring image
patches. We find the affine transformations between two image patches
using a differential method (see [12]). In Section 2 we derive the
relationship between the texture distortion map and all five
parameters of the surface, namely orientation (slant and tilt) and
shape (three parameters: principal curvatures and directions), by
exploiting the previous mathematical analysis of texture gradients on
curved surfaces by Gårding [8].

Figure 1: Determining the affine transformation, $A$, between the
texture at point $\mathbf{p}_1$ and the texture at point
$\mathbf{p}_2$.

In Section 3 of the paper, we develop two algorithms for recovering
surface orientation and shape based on the estimated affine
transforms in a number of different directions. The second algorithm
will give us estimates of all five shape parameters. We know of no
previous researchers who have estimated curvature parameters, though
several researchers have derived equations relating the local shape
parameters to various texture gradients [8, 10]. We present
simulation results on a number of examples of planar and curved
surfaces.


2  Relationship  between  the  texture 
distortion  map  and  3D  shape 

We argued previously that instead of studying texture gradients, one
should model texture distortion in a particular direction on the
image plane as an affine transformation. This section will develop
the relationship between the parameters of this affine transformation
and the surface shape and pose, for spherical perspective projection.

Figure 1 depicts the situation. We wish to find the matrix, $A$,
which represents the affine transformation between the spherically
projected texture at point $\mathbf{p}_1$ and the projected texture
at a nearby point $\mathbf{p}_2$. This matrix will be a function of
the local orientation and shape parameters. The orientation
parameters are $\sigma$, the slant of the surface, and $\mathbf{t}$,
the direction of tilt of the surface. The shape parameters are
$r\kappa_t$, $r\kappa_b$, and $r\tau$, where $\kappa_t$ is the normal
curvature in the tilt direction, $\kappa_b$ the normal curvature in
the perpendicular direction, and $\tau$ is the geodesic torsion. The
Gaussian curvature is $K = \kappa_t\kappa_b - \tau^2$. The variable
$r$ is the distance from the center of the viewing sphere to the
given point on the surface. Note that the inseparability of the
distance, $r$, and the curvature parameters is inherent to the
problem. The image of a surface $S$ at distance $r$ is
indistinguishable from that of a $k$-scaled copy (for which the
curvatures will be $1/k$ of those of $S$) at a distance of $kr$.

To find the affine transformation, we first backproject from the
point $\mathbf{p}_1$ on the viewsphere to the corresponding point
$\mathbf{P}_1$ on the surface, using the map $F_*(\mathbf{p}_1)$. Let
$\mathbf{t}$ be a unit vector in the tilt direction for some point
$\mathbf{p}$. Then if $\mathbf{p}$ is the unit normal to the viewing
sphere in the direction of the surface, let
$\mathbf{b} = \mathbf{p} \times \mathbf{t}$. Then
$(\mathbf{t}, \mathbf{b})$ forms an orthonormal basis for the tangent
plane to the viewing sphere at point $\mathbf{p}$. $\mathbf{t}$ and
$\mathbf{b}$ backproject to form the basis $(\mathbf{T}, \mathbf{B})$
on the tangent plane of the surface at the backprojection of point
$\mathbf{p}$. We will write the backprojection map at point
$\mathbf{p}_1$ in terms of the bases $(\mathbf{t}_1, \mathbf{b}_1)$
and $(\mathbf{T}_1, \mathbf{B}_1)$.

We assume that the texture is constant over the surface, so the
transformation between points $\mathbf{P}_1$ and $\mathbf{P}_2$ is
only the rotation, by some angle $\delta_T$, between the two bases
$(\mathbf{T}_1, \mathbf{B}_1)$ and $(\mathbf{T}_2, \mathbf{B}_2)$. In
[13] we show that the rotation between the bases is

$$\delta_T = \frac{r\tau}{\sin\sigma}\,\Delta t + \left(\sin\sigma + \cos\sigma\cot\sigma + r\kappa_b\cot\sigma\right)\Delta b \qquad (1)$$

as $\Delta t, \Delta b \to 0$. The texture on the surface undergoes
rotation by $-\delta_T$.

Next, we project back onto the viewing sphere, using the matrix
$F_*^{-1}(\mathbf{P}_2)$. This puts us back on the viewing sphere,
but in the $(\mathbf{t}_2, \mathbf{b}_2)$ basis, not the original
$(\mathbf{t}_1, \mathbf{b}_1)$ basis. We must convert between these
bases by rotating by the angle between the tilt vectors, $\delta_t$.
As we show in [13],

$$\delta_t = \frac{r\tau}{\cos\sigma\sin\sigma}\,\Delta t + \frac{1}{\sin\sigma}\left(\cos\sigma + r\kappa_b\right)\Delta b \qquad (2)$$

as $\Delta t, \Delta b \to 0$. Thus we have (see [13])
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The  actual  affine  transformation  we  find  between 
the  two  points  will  not,  in  general,  be  in  terms  of  the 
$(\mathbf{t}, \mathbf{b})$ basis. Instead, our matrix will be related
by a change of basis: $\hat{A} = U A U^{-1}$. The change of basis
matrix, $U$, rotates the standard basis to the
$(\mathbf{t}, \mathbf{b})$ basis.
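As a small illustration of this change of basis, the sketch below conjugates a 2x2 matrix expressed in the (t, b) basis by a rotation through the tilt angle. The helper names and the choice to represent U as a counterclockwise rotation by a hypothetical `tilt_angle` are assumptions made for the example.

```python
import math


def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def rotation(theta):
    """Counterclockwise rotation by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def to_image_basis(A_tb, tilt_angle):
    """Return U A U^{-1}, where U rotates the standard basis onto (t, b)."""
    U = rotation(tilt_angle)
    U_inv = rotation(-tilt_angle)  # the inverse of a rotation is the opposite rotation
    return matmul(matmul(U, A_tb), U_inv)


# A pure scaling in the (t, b) basis, seen from axes rotated by 90 degrees.
A_tb = [[2.0, 0.0], [0.0, 1.0]]
A_img = to_image_basis(A_tb, math.pi / 2)
```

With a 90 degree tilt the two scale factors simply swap axes, while the trace and determinant are unchanged, as expected for a similarity transform.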

The  analysis  above  assumes  that  we  have  spherical 
projection.  In  reality,  cameras  use  planar  projection. 
We  can  convert  between  the  two  types  of  projection 
by  applying  the  Jacobian  of  the  gaze  transformation 
[6,  13]. 

3  Shape  Recovery:  Algorithms  and 
Experimental  results 

To estimate the texture distortion map at a point $\mathbf{p}$, we
find the spectrograms for that point and for neighboring points in a
number of different directions,
$\mathbf{v}_i = (\Delta t_i, \Delta b_i)^T$, around $\mathbf{p}$ and
get estimates, $\hat{A}_i$, of the affine transforms, $A_i$, for each
of these directions using the algorithm developed in the previous
section. We have developed two algorithms for recovering surface
orientation (slant and tilt) and shape (principal curvatures and
directions). We will present experimental results on a number of
images of planar and curved objects.

The first shape recovery method is based on
singular value decomposition (SVD) of the $A_i$ matrices
to  estimate  k\^  and  k'^.  By  solving  two  associated 
systems of linear equations, we obtain estimates of the slant,
$\sigma$, the tilt direction specified by $\theta_t$, and the shape
parameters $r\kappa_t$ and $r\tau$ [12, 13]. Note that for the
systems of linear equations to be solvable, it is necessary and
sufficient to find affine transforms in two independent directions.

Table  1  shows  the  results  of  the  algorithm  on  seven 
images.  For  each  of  these  images,  we  found  the  affine 
transform  in  eight  different  directions  around  a  given 
point  of  the  image.  Each  of  these  images  was  created 
by  mapping  Brodatz  textures  on  various  surfaces.  The 
image  marked  “noise”  is  the  “wire”  image,  with  added 
noise  of  standard  deviation  30.  The  first  four  surfaces 
are  planar,  followed  by  a  cylinder  and  two  spheres. 
The torsion, $\tau$, is zero for all of the examples.

Complete information about local surface shape requires knowledge of
three parameters, and here we have only found two: $r\kappa_t$ and
$r\tau$; the third parameter, $r\kappa_b$, is left undetermined. This
was to be expected, as this algorithm is essentially based on
factoring the affine transform matrices to obtain the major and minor
axis gradients; we know from Gårding that these underspecify shape.

Figure 2: Two example textured surfaces.


True 

1  Estimated 

# 

e 

tilt 

rx, 

rxt 

1 

tilt 

rn, 

rr 

1 

69 
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64 
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56 
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3 

64 

-25 

0 

0 

57 
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4 

64 
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0 

0 

49 
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1.7 

5 

28 
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6.5 

0 

42 
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.49 

.03 

6 

39 
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66 

6.6 

80 

166 

-.26 

.46 

7 

40 

180 

6.6 

6.6 

79 

174 

QO 

.43 

Table  1:  True  and  Estimated  Surface  Parameters: 
Method  1 

For the planar cases, we get consistent underestimation of slant,
but in general the results are good. The algorithm does quite well
even in the presence of noise. We get good results on the "straw"
planar surface, even though this texture does not satisfy the
isotropy assumption commonly used by other researchers. The tilt and
some of the slant estimates for the curved surfaces were also
reasonable, but the curvature estimates were not accurate enough to
be usable. If this algorithm were to be used for shape estimation by
itself, the recommended strategy would be to obtain slant and tilt
estimates at a large number of points and then fit a smooth surface
consistent with these estimates. Differentiating this fitted surface
yields the desired shape parameters.

Table 2: True and Estimated Surface Parameters: Method 2

A better approach is to use our second algorithm, which gives much
more accurate orientation and shape estimates locally without any
need for a surface fitting stage. The second algorithm is based on
finding the orientation and shape parameters which minimize the sum
of squared errors between the predicted and empirically measured
entries of the affine transformation matrices, i.e., we wish to
minimize the following error function:

$$E = \sum_{i=1}^{n}\sum_{k=1}^{2}\sum_{l=1}^{2}\left(A_i(k,l) - \hat{A}_i(k,l)\right)^2$$

where $A_i(k,l)$ is the $(k,l)$th element of the theoretically
predicted matrix $A_i$ and is a function of the shape parameters, and
$\hat{A}_i(k,l)$ is the $(k,l)$th element of the empirically measured
affine transform matrix $\hat{A}_i$. Ideally, each term in this error
sum should be weighted by the inverse of the standard deviation of
the measurement error of that particular entry. A specific
characterization of the probability distribution of the measurement
errors in the entries of the affine transform matrices $\hat{A}_i$ is
not yet available. We expect it to depend on the particular algorithm
used for the estimation of the affine transforms. In the absence of a
particularly appropriate model, we will proceed on the (convenient!)
assumption that these errors are independent and normally distributed
with standard deviation $\Delta a$.
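The unweighted error sum described above can be written out directly. The sketch below assumes each affine transform is supplied as a 2x2 nested list, one per direction; the function name `squared_error` and the sample matrices are hypothetical.

```python
def squared_error(predicted, measured):
    """Sum over directions i and entries (k, l) of (A_i(k,l) - Ahat_i(k,l))^2."""
    return sum(
        (A[k][l] - A_hat[k][l]) ** 2
        for A, A_hat in zip(predicted, measured)
        for k in range(2)
        for l in range(2)
    )


# Two directions: predicted transforms are identities, measurements are perturbed.
predicted = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
measured = [[[1.1, 0.0], [0.0, 0.9]], [[1.0, 0.2], [0.0, 1.0]]]
error = squared_error(predicted, measured)
```

In a full implementation the `predicted` matrices would be recomputed from the five shape parameters at every step of the minimizer; that dependence is omitted here.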

For minimizing the error function, we simply used the gradient
descent routine in the Mathematica package; any of a number of
variants, such as conjugate gradient or Levenberg-Marquardt, could
have been used equivalently. The starting point is provided by the
orientation and shape estimate returned by the first algorithm.
Table 2 shows the results for the same images as in Table 1. We get
improved slant and tilt estimates, and also significantly better
estimates of the curvature parameters. We will now determine
confidence intervals on the orientation and shape estimates. For more
details on the methodology that we follow, the reader is referred to
[15], Chapters 14.4 and 14.5.

Table 3: Error Bounds on True Parameters

Let us abbreviate the five geometrical parameters as $g_i$,
$i = 1, 2, \ldots, 5$. To obtain confidence intervals on the
parameters, one computes the so-called curvature matrix $[\alpha]$,
which is defined as half of the Hessian of the function
$\chi^2 = E/\Delta a^2$, where $E$ is the error function being
minimized:

$$\alpha_{kl} = \frac{1}{2\Delta a^2}\,\frac{\partial^2 E}{\partial g_k\,\partial g_l}$$

The inverse of this $5 \times 5$ curvature matrix is the covariance
matrix, $C$, of the fit, on the assumption of independent,
identically normally distributed errors in the entries of the affine
transformation matrices. In that case the confidence intervals for
the parameters are given by $\sqrt{C_{ii}}$. The confidence interval
for parameter $g_i$, $\pm\delta g_i$, is $\pm\sqrt{C_{ii}}$ for 68
percent confidence and $\pm 2\sqrt{C_{ii}}$ for 95 percent
confidence. In Table 3 we give the 68 percent confidence intervals
for slant, assuming* a measurement error $\Delta a$ of standard
deviation 0.0323. Note that these are intervals surrounding the true
parameters. Using this value for the standard deviation, 66 percent
of all of the estimated parameters fall within the 68 percent
confidence intervals of the true parameters.
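The recipe above (half the Hessian, scaled by the measurement variance, then inverted) can be sketched numerically. The example below uses a finite-difference Hessian and a toy two-parameter straight-line fit in place of the five-parameter shape model; the helper names, the step size `h`, and the toy error function are all assumptions introduced for illustration.

```python
import math


def hessian(f, g, h=1e-4):
    """Central finite-difference Hessian of the scalar function f at the point g."""
    n = len(g)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                q = list(g)
                q[i] += di
                q[j] += dj
                return f(q)
            H[i][j] = (shifted(h, h) - shifted(h, -h)
                       - shifted(-h, h) + shifted(-h, -h)) / (4 * h * h)
    return H


def invert(M):
    """Gauss-Jordan inverse of a small square matrix with partial pivoting."""
    n = len(M)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def confidence_half_widths(E, g_min, delta_a):
    """68 percent half-widths sqrt(C_ii), with C the inverse of Hessian(E) / (2 delta_a^2)."""
    alpha = [[v / (2 * delta_a ** 2) for v in row] for row in hessian(E, g_min)]
    C = invert(alpha)
    return [math.sqrt(C[i][i]) for i in range(len(g_min))]


# Toy stand-in for the error function: a straight-line fit with two parameters.
xs, ys = [-1.0, 0.0, 1.0], [1.0, 2.0, 3.0]


def toy_error(g):
    return sum((g[0] + g[1] * x - y) ** 2 for x, y in zip(xs, ys))


widths = confidence_half_widths(toy_error, [2.0, 1.0], delta_a=0.0323)
```

For this toy model the result can be checked in closed form: the covariance is $\Delta a^2 (X^T X)^{-1}$, giving half-widths of $\Delta a/\sqrt{3}$ and $\Delta a/\sqrt{2}$. Doubling the half-widths gives the 95 percent intervals.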

In conclusion, we have presented a method for finding the shape of
surfaces locally from texture distortion, modeled as a set of affine
transforms in different directions in the image. The advantage of
this representation is that it captures all the information available
locally and does so without any restrictive assumptions. We develop a
differential technique for estimating the affine transforms, which
can be applied to a number of other vision problems.

Our results demonstrate that local shape-from-texture without any a
priori assumptions on the texture is a viable module for early
vision.
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ground  truth  in  these  synthetic  examples. 
Abstract. Shape from texture is best analyzed in two stages,
analogous to stereopsis and structure from motion: (a) Computing the
'texture distortion', and (b) Interpreting the 'texture distortion'
to infer the orientation and shape of the surface. We model the
texture distortion for a given point and direction on the image plane
as an affine transformation and derive the relationship between the
parameters of this transformation and the shape parameters. We use
nonlinear minimization of a least squares error criterion to estimate
the shape parameters from the affine transformations, using a simple
linear algorithm to obtain an initial guess. Under the assumption
that the measurement errors in the affine parameters are independent
and normally distributed, we can find error bounds on the shape
parameter estimates. We present results on images of planar and
curved surfaces under perspective projection. We find all five local
shape and orientation parameters with no a priori assumptions about
the shape of the surface.


1  Introduction 

In  its  geometric  essence,  shape  from  texture  is  a  cue  to  3D  shape  very  similar 
to  binocular  stereopsis  and  structure  from  motion.  All  of  these  cues  are  based 
on  the  information  available  in  multiple  perspective  views  of  the  same  surface 
in  the  scene.  In  binocular  stereopsis,  the  two  eyes  get  slightly  different  views  of 
the  same  surface;  in  structure  from  motion,  the  relative  motion  of  the  observer 
and  the  surface  generates  the  different  views.  To  put  shape  from  texture  in 
this framework, consider two nearby patches on a surface in the scene with
the same (or sufficiently similar) texture. The appearances of the two patches in a
single  monocular  image  will  be  slightly  different  because  of  the  slightly  different 
geometrical  relationships  that  they  have  with  respect  to  the  observer’s  eye  or 
camera.  We  thus  get  multiple  views  in  a  single  image. 

This naturally suggests a two-stage framework: (1) Computing the
'texture distortion' from the image, and (2) Interpreting the
'texture distortion' to infer the orientation and shape of the scene
surface in 3D. The 'texture distortion' is the counterpart in texture
analysis of binocular disparity in stereopsis or optical flow in
structure from motion. We believe that the natural way to model the
texture distortion locally is as a 2-D affine transformation between
neighboring image patches. This affine transformation will depend on
the direction and magnitude of the vector displacement between the
two patches in the image. We find the affine transformations between
two image patches using a differential method (see [12, 13]). We will
call the map which associates to each direction in the image plane an
affine transformation the texture distortion map. For each point on a
smoothly curved textured surface this map is well defined and can be
related to surface shape and orientation with respect to the viewer.

In Sect. 3 we derive the relationship between the texture distortion
map and the surface orientation (slant and tilt) and shape (principal
curvatures and directions). This derivation makes use of previous
results due to Gårding [8] and builds on our previous derivation in
[12].

In Sect. 4 of the paper, we develop a new algorithm for recovering
surface orientation and shape based on the estimated affine
transforms in a number of different directions. The method uses
nonlinear minimization of a least squares error criterion to estimate
the shape parameters. We use a simple linear algorithm based on
singular value decomposition of the linear parts of the affine
transforms to find the initial conditions for the minimization
procedure [13]. This linear algorithm is a slight modification of the
work we described in [12], and we will not describe it in this paper.

Our  shape  estimation  algorithm  is  arguably  optimal  in  a  maximum  likelihood 
sense  if  the  measurement  errors  in  the  affine  parameters  can  be  assumed  to  be 
independent  and  normally  distributed.  By  studying  the  Hessian  of  the  error 
function  at  the  minimum  point,  one  can  characterize  the  confidence  intervals 
of  the  shape  estimates.  We  present  simulation  results  on  a  number  of  examples 
of  planar  and  curved  surfaces.  Finally,  we  will  discuss  predictions  for  human 
perception  of  shape  from  texture. 


2  Relationship  to  Previous  Work 

We review previous shape from texture research in detail in
[12, 13]. Much of the previous research in shape from texture has
assumed either a homogeneous (i.e. constant area or density)
[9, 1, 10, 17, 14] or an isotropic [18, 6, 5, 4] texture in the
scene. The isotropy assumption will be incorrect for many textures
such as grass, fabric, and bricks. Both of these assumptions allow
one to make only partial use of the available information: under the
homogeneity assumption one cannot make use of the change in shape of
the texture elements, and under the isotropy assumption one cannot
make use of the change in size of the elements. Obviously, some
assumption about the texture is necessary, otherwise what we are
seeing could in fact just be a particular pattern of reflectance
changes on a flat surface (as in a Renaissance painting). We will
assume that the texture is the same at different points on the
surface in the scene. While this implies periodicity for a
deterministic pattern, for a texture which is best thought of as a
realization of a stochastic process we can formalize this as
stationarity under translations. The term homogeneity is used in the
probability and statistical literature as equivalent to stationarity
under translations. A stochastic process is specified by giving the
joint distributions of any finite subsets of the variables. The thing
to note here is that we can assume that not only the first but also
the second (and higher) order statistics are translation-invariant.
This is more powerful than assuming, as in previous use of
homogeneity in the computer vision literature, that just the first
order statistics (e.g. fraction of surface area occupied by texels)
are translation-invariant. We will be able to exploit changes in
shapes of texture elements as well.

In Sect. 3 we will relate the parameters of an affine transformation
between two image patches to five of the local shape and orientation
parameters: slant, tilt, and three curvature parameters. Section 4
presents the algorithm for estimating all five shape parameters, with
results on synthetic and real images. To our knowledge, this is the
first time that direct estimation of curvature parameters from
textured images has been demonstrated. While Gårding's [8] analysis
dealt with general curved surfaces, his algorithms for estimating
shape from distortion gradients permit only the computation of slant
and tilt.

3  Relationship  Between  the  Texture  Distortion  Map  and 
3D  Shape 

In this section we will develop the relationship between the
parameters of the affine transformation between a pair of image
patches and the surface shape and pose. We use perspective projection
to a spherical image surface instead of to a planar surface. While
there is a 1-1 mapping which relates the two kinds of perspective
projection, known as the gaze transformation, the relations which
follow turn out to be simpler in the spherical case [9, 8].

This section has two parts. The first part is a review of the
formalism developed by Gårding [8]; essentially, he defines an
orthonormal frame field on the image sphere with one of the vectors
in the tilt direction. The backprojection map takes on a particularly
simple form, and he is able to obtain expressions for the different
texture gradients for the general situation of smooth curved surfaces
under perspective projection.

In  the  second  part,  we  exploit  this  frame  field  to  derive  an  expression  for  the 
affine  transformation  on  the  image  sphere  which  relates  two  neighboring  image 
patches. 


3.1  The  slant-tilt  frame  field 

This section is based on Gårding [8], to which the reader is
referred for proofs of the various assertions. Relevant differential
geometry concepts may be found in O'Neill [15].

The basic geometry is illustrated in Fig. 1. A smooth surface $S$ is
mapped by central projection to a unit sphere $\Sigma$ centered at
the focal point. The backprojection map $F$ from $\Sigma$ to $S$ is
defined as $F(\mathbf{p}) = \mathbf{r}(\mathbf{p}) = r(\mathbf{p})\,\mathbf{p}$,
where $\mathbf{p}$ is a unit vector from the focal point to a point
on the image sphere, and $r(\mathbf{p})$ is the distance along the
visual ray from the focal point through $\mathbf{p}$ to the
corresponding point $\mathbf{r} = F(\mathbf{p})$ on the surface $S$.
We consider this map for regions of the surface where the map is not
singular by excluding neighborhoods containing the occluding contour.
The derivative of the backprojection map, $F_*$, maps tangent vectors
of $\Sigma$ at $\mathbf{p}$ to tangent vectors of $S$ at
$F(\mathbf{p})$.


[Fig. 1: viewing sphere $\Sigma$ and surface $S$]


Define the tilt direction $\mathbf{t}$ in $T_p(\Sigma)$, the tangent
plane of the viewing sphere at $\mathbf{p}$, to be a unit vector in
the direction of the gradient of the distance function
$r(\mathbf{p})$, and the auxiliary vector
$\mathbf{b} = \mathbf{p} \times \mathbf{t}$. Then
$(\mathbf{t}, \mathbf{b})$ form an orthonormal basis for the tangent
plane to the image sphere $\Sigma$ and together with $\mathbf{p}$
constitute an orthonormal frame field on $\Sigma$. Gårding shows that
$\mathbf{t}$ and $\mathbf{b}$ backproject to orthogonal vectors
$F_*(\mathbf{t}) = r'\mathbf{p} + r\mathbf{t}$ and
$F_*(\mathbf{b}) = r\mathbf{b}$ in the tangent space
$T_{F(p)}(S)$. Dividing these vectors by their lengths gives us an
orthonormal basis $(\mathbf{T}, \mathbf{B})$ of the tangent space of
the surface at $F(\mathbf{p})$. The vectors $\mathbf{T}$,
$\mathbf{B}$ along with the unit normal to the surface
$\mathbf{N} = \mathbf{T} \times \mathbf{B}$ constitute an orthonormal
frame field on the surface. The slant angle $\sigma$ is defined to be
the angle between the surface normal $\mathbf{N}$ and the viewing
direction $\mathbf{p}$, so that
$\cos\sigma = \mathbf{N} \cdot \mathbf{p}$.
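The construction of the two frame fields can be checked numerically in a simple case. The sketch below assumes the surface is a plane $\{\mathbf{x} : \mathbf{n}\cdot\mathbf{x} = d\}$ with unit normal $\mathbf{n}$ and $d > 0$, chosen only because $r(\mathbf{p}) = d/(\mathbf{n}\cdot\mathbf{p})$ then has a closed form; the function name `slant_tilt_frame` and the vector helpers are hypothetical, and the slant must be nonzero for the tilt direction to be defined.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def scale(u, s):
    return [a * s for a in u]


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def unit(u):
    return scale(u, 1.0 / math.sqrt(dot(u, u)))


def slant_tilt_frame(p, n, d):
    """Frames (t, b) and (T, B, N) at view direction p for the plane n.x = d."""
    c = dot(n, p)
    r = d / c                                          # F(p) = r p
    # Gradient of r(p) = d / (n.p) restricted to the tangent plane of the sphere.
    grad_r = scale(add(n, scale(p, -c)), -d / c ** 2)
    r_prime = math.sqrt(dot(grad_r, grad_r))           # derivative of r along t
    t = scale(grad_r, 1.0 / r_prime)                   # tilt direction
    b = cross(p, t)
    T = unit(add(scale(p, r_prime), scale(t, r)))      # F_*(t) = r' p + r t, normalized
    B = b                                              # F_*(b) = r b, normalized
    N = cross(T, B)
    slant = math.acos(dot(N, p))
    return t, b, T, B, N, slant


# Plane tilted by 30 degrees about the y axis, viewed along the z axis.
angle = math.radians(30.0)
n = [math.sin(angle), 0.0, math.cos(angle)]
t, b, T, B, N, slant = slant_tilt_frame([0.0, 0.0, 1.0], n, d=5.0)
```

In this example the recovered normal coincides with the plane normal and the slant equals the 30 degree tilt of the plane, consistent with $\cos\sigma = \mathbf{N}\cdot\mathbf{p}$.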

The shape of the surface is captured in the shape operator, which
measures how the surface normal $\mathbf{N}$ changes as one moves in
various directions in the tangent space of the surface
$T_{F(p)}(S)$. One can represent the shape operator in the
$(\mathbf{T}, \mathbf{B})$ basis as

$$\begin{pmatrix} -\nabla_T \mathbf{N} \\ -\nabla_B \mathbf{N} \end{pmatrix} = \begin{pmatrix} \kappa_t & \tau \\ \tau & \kappa_b \end{pmatrix} \begin{pmatrix} \mathbf{T} \\ \mathbf{B} \end{pmatrix} \qquad (1)$$

where $\kappa_t$ is the normal curvature in the $\mathbf{T}$
direction, $\kappa_b$ the normal curvature in the $\mathbf{B}$
direction and $\tau$ is the geodesic torsion. The determinant of the
operator gives the Gaussian curvature
$K = \kappa_t\kappa_b - \tau^2$, and half the trace is the mean
curvature $H = (\kappa_t + \kappa_b)/2$.
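As a quick worked example of these definitions, the sketch below computes the Gaussian and mean curvatures from the three entries of the shape operator and, as an added step not spelled out in the text, the principal curvatures as the eigenvalues of the symmetric 2x2 matrix. The function name `curvatures` is hypothetical.

```python
import math


def curvatures(kappa_t, kappa_b, tau):
    """Gaussian curvature, mean curvature and principal curvatures of the shape operator."""
    K = kappa_t * kappa_b - tau ** 2   # determinant of the operator
    H = (kappa_t + kappa_b) / 2        # half the trace
    # Eigenvalues of a symmetric 2x2 matrix: H +/- sqrt(H^2 - K); the radicand
    # equals ((kappa_t - kappa_b) / 2)^2 + tau^2 and is therefore never negative.
    root = math.sqrt(max(H * H - K, 0.0))
    return K, H, (H + root, H - root)


K1, H1, principal1 = curvatures(2.0, 1.0, 0.0)   # frame aligned with principal directions
K2, H2, principal2 = curvatures(1.0, 1.0, 1.0)   # nonzero torsion
```

When the torsion is zero the frame $(\mathbf{T}, \mathbf{B})$ is already aligned with the principal directions, so the principal curvatures are just $\kappa_t$ and $\kappa_b$. The second case has $K = 0$, i.e. a locally cylindrical (parabolic) point.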

Gårding goes on to obtain expressions for the derivatives of the
$(\mathbf{p}, \mathbf{t}, \mathbf{b})$ frame field on $\Sigma$
expressed in terms of the frame field itself. One can also define the
derivatives of the frame field $(\mathbf{N}, \mathbf{T}, \mathbf{B})$
on $S$ with respect to $(\mathbf{p}, \mathbf{t}, \mathbf{b})$ by
first pulling back these fields from the surface $S$ to $\Sigma$.

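Using the linearity of the derivative, we can compute
$\nabla_v \mathbf{t}$ for an arbitrary vector
$\mathbf{v} = \Delta t\,\mathbf{t} + \Delta b\,\mathbf{b}$ in the
tangent space:

$$\nabla_v \mathbf{t} = -\Delta t\,\mathbf{p} + \frac{r\tau}{\cos\sigma\sin\sigma}\,\Delta t\,\mathbf{b} + \frac{1}{\sin\sigma}\left(\cos\sigma + r\kappa_b\right)\Delta b\,\mathbf{b} \qquad (2)$$

where $r$ is the distance to the object from the center of the
viewing sphere.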

In  the  next  section,  we  will  also  need  the  derivative  of  the  T  vector  field. 
^\e  can  compute  VyT  for  an  arbitrary  vector  v  =  Att+  Abb  in  the  tangent 
space[13]: 

^vT  =  ( - At  +  rrzl6)N  +  (-: - )At  B  + - (1  +  rx*  co6ff)Ab'B  (3) 

cos  a  sin  cr  sin  e  '  ' 

3.2  Affine  Transformations  on  the  Image  Sphere 

Figure 2 depicts the situation. We wish to find the matrix, A, which represents
the affine transformation between the spherically projected texture at point p1
and the projected texture at a nearby point p2. This matrix will be a function
of the local orientation and shape parameters. The orientation parameters are
σ, the slant of the surface, and t, the direction of tilt of the surface. The shape
parameters are rκ_t, rκ_b, and rτ. The variable r is the distance from the center of
the viewing sphere to the given point on the surface. Note that the inseparability
of the distance, r, and the curvature parameters is inherent to the problem. The
image of a surface S at distance r is indistinguishable from that of a k-scaled
copy (for which the curvatures will be 1/k of those of S) at a distance of kr.

Fig. 2. Determining the affine transformation, A, between the texture at
point p1 and the texture at point p2.

Our analysis is a differential analysis. We will freely assume that p2 − p1
can be modelled as a vector v = Δt t + Δb b in the tangent space at the point
p1 and the expressions derived will be true in the limit as Δt, Δb → 0.

To find the affine transformation, we first backproject from the point p1 on
the viewsphere to the corresponding point P1 on the surface, using the map
F_*(p1). Using the basis (t1, b1) on the tangent plane of the image sphere Σ,
and (T1, B1) on the tangent plane on the surface S, this map can be represented
as

$$
F_*(p_1) = \begin{pmatrix} r_1/\cos\sigma_1 & 0 \\ 0 & r_1 \end{pmatrix}
= \begin{pmatrix} 1/m_1 & 0 \\ 0 & 1/M_1 \end{pmatrix}
$$

We see that m1 is the scaling of the texture pattern in the “minor axis,” i.e. in
the tilt direction, due to projection, and M1 is the scaling in the “major axis.”

We assume that the texture is constant over the surface, so the transformation
between points P1 and P2 is only the rotation, by some angle δ_T, between the
two bases (T1, B1) and (T2, B2). To find this angle, we begin by noting that
T2 = T1 + ∇_v T to first order, and hence from equation 3

$$
\mathbf{T}_2 = \mathbf{T}_1
+ \left(\frac{r\kappa_t}{\cos\sigma}\,\Delta t + r\tau\,\Delta b\right)\mathbf{N}_1
+ \frac{r\tau}{\sin\sigma}\,\Delta t\,\mathbf{B}_1
+ \frac{1}{\sin\sigma}\,(1 + r\kappa_b\cos\sigma)\,\Delta b\,\mathbf{B}_1 \tag{4}
$$

In the right hand side of this equation, the term in the direction of the surface
normal represents a change of the plane of the frame as a whole, and the
terms in the direction of B1 represent a rotation about the surface normal on
S. Since T, B are unit vectors, we see that

$$
\delta_T = \frac{r\tau}{\sin\sigma}\,\Delta t
+ \frac{1}{\sin\sigma}\,(1 + r\kappa_b\cos\sigma)\,\Delta b \tag{5}
$$

as Δt, Δb → 0. Note that if the T, B basis vectors undergo counterclockwise
rotation by δ_T, the texture on the surface undergoes rotation by −δ_T.

Next, we project back onto the viewing sphere, using the matrix F_*^{-1}(p2).
This puts us back on the viewing sphere, but in the (t2, b2) basis, not the
original (t1, b1) basis. We must convert between these bases by rotating by the
angle between the tilt vectors, δ_t. As in the case of the T vector field, to find
this angle we begin by noting that t2 = t1 + ∇_v t to first order. Hence from
equation 2 we get

$$
\mathbf{t}_2 = \mathbf{t}_1 - \Delta t\,\mathbf{p}_1
+ \frac{r\tau}{\cos\sigma\,\sin\sigma}\,\Delta t\,\mathbf{b}_1
+ \frac{1}{\sin\sigma}\,(\cos\sigma + r\kappa_b)\,\Delta b\,\mathbf{b}_1 \tag{6}
$$

As before, the term in the direction of p1 represents a change of the plane of
the frame as a whole, and the term in the direction of b1 represents a rotation
of the frame about the normal. We obtain

$$
\delta_t = \frac{r\tau}{\cos\sigma\,\sin\sigma}\,\Delta t
+ \frac{1}{\sin\sigma}\,(\cos\sigma + r\kappa_b)\,\Delta b \tag{7}
$$

as Δt, Δb → 0. Thus we have

$$
A = \mathrm{Rot}(\delta_t)\,F_*^{-1}(p_2)\,\mathrm{Rot}(-\delta_T)\,F_*(p_1)
= \mathrm{Rot}(\delta_t)
\begin{pmatrix} m_2 & 0 \\ 0 & M_2 \end{pmatrix}
\mathrm{Rot}(-\delta_T)
\begin{pmatrix} 1/m_1 & 0 \\ 0 & 1/M_1 \end{pmatrix} \tag{8}
$$

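Gårding showed that the normalized gradients of the minor and major axis
scale factors, in the (t, b) basis, are

$$
\frac{\nabla m}{m} = -\tan\sigma \begin{pmatrix} 2 + r\kappa_t/\cos\sigma \\ r\tau \end{pmatrix},
\qquad
\frac{\nabla M}{M} = -\tan\sigma \begin{pmatrix} 1 \\ 0 \end{pmatrix}
$$

We note that

$$
m_2 = m_1 + \nabla m \cdot (\Delta t \;\; \Delta b)^T, \qquad
M_2 = M_1 + \nabla M \cdot (\Delta t \;\; \Delta b)^T \tag{9}
$$

to first order, where p2 − p1 = (Δt Δb)^T is the step between the points p1
and p2 in the image. Using this equation and equation 8, we get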


A = Rot(δ_t) ·

The actual affine transformation we find between the two points will not,
in general, be in terms of the (t, b) basis. Instead, our matrix will be related
by a change of basis: Â = U A U⁻¹. The change of basis matrix, U, rotates the
standard basis to the (t, b) basis and is given by Rot(θ_t), where θ_t is the tilt
angle at point p1.
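
The composition described above can be sketched numerically. The following is a minimal illustration, assuming that Rot(·) is a counterclockwise 2×2 rotation and that the rotation angles are the first-order expressions read from equations 5 and 7. All function and argument names are hypothetical, and the scale factors m and M are passed in directly rather than derived from their gradients.

```python
import numpy as np

def rot(angle):
    """Counterclockwise 2x2 rotation matrix (assumed convention for Rot)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s], [s, c]])

def frame_rotations(sigma, r_kappa_b, r_tau, dt, db):
    """First-order rotation angles of the (T, B) and (t, b) frames, as in eqs. 5 and 7."""
    delta_T = (r_tau * dt + (1.0 + r_kappa_b * np.cos(sigma)) * db) / np.sin(sigma)
    delta_t = (r_tau * dt / np.cos(sigma) + (np.cos(sigma) + r_kappa_b) * db) / np.sin(sigma)
    return delta_T, delta_t

def affine_in_tilt_basis(m1, M1, m2, M2, delta_T, delta_t):
    """A = Rot(delta_t) F*^-1(p2) Rot(-delta_T) F*(p1), expressed in the (t, b) basis."""
    backproject_p1 = np.diag([1.0 / m1, 1.0 / M1])  # F*(p1)
    project_p2 = np.diag([m2, M2])                  # inverse of F*(p2)
    return rot(delta_t) @ project_p2 @ rot(-delta_T) @ backproject_p1

def to_image_basis(A, theta_t):
    """Change of basis U A U^-1 with U = Rot(theta_t); U^-1 equals U transposed."""
    U = rot(theta_t)
    return U @ A @ U.T

# Planar surface (zero curvature and torsion) at 60 degrees slant, small step along b.
sigma = np.pi / 3
delta_T, delta_t = frame_rotations(sigma, r_kappa_b=0.0, r_tau=0.0, dt=0.0, db=0.01)
# With no step and unchanged scale factors the transformation reduces to the identity.
A_identity = to_image_basis(affine_in_tilt_basis(0.5, 1.0, 0.5, 1.0, 0.0, 0.0), 0.3)
```

For a plane the two angles reduce to Δb / sin σ and Δb cot σ, which is a convenient sanity check on any implementation of these formulas.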

The analysis above assumes that we have spherical projection. In reality,
cameras use planar projection. We can convert between the two types of
projection by applying the Jacobian of the gaze transformation [7]. For each of the
examples in this paper the gaze transformation was insignificant, so we ignored
it.


4  Shape  Recovery  Algorithm  and  Experimental  Results 

In this section, we develop a new algorithm for recovering surface orientation
(slant and tilt) and shape (principal curvatures and directions), with an
associated sensitivity analysis. We will also present experimental results on a number
of images of planar and curved objects, and discuss predictions for human
perception of shape from texture.

There are five unknowns: the slant σ, the tilt direction specified by θ_t, and
the three shape parameters (rκ_t, rκ_b, rτ). Each estimation of an affine transform
in an image direction v_i = (Δt_i Δb_i)^T yields four nonlinear equations. Simple
equation counting tells us that one direction is not enough, and generically two
directions ought to be sufficient.

Our  shape  recovery  algorithm  is  based  on  finding  the  orientation  and  shape 
parameters  which  minimize  the  sum  of  squared  errors  between  the  predicted  and 
empirically  measured  entries  of  the  affine  transformation  matrices,  i.e.,  we  wish 
to  minimize  the  following  error  function: 

$$
\chi^2(\sigma, \theta_t, r\kappa_t, r\kappa_b, r\tau)
= \sum_{i=1}^{n} \sum_{k=1}^{2} \sum_{l=1}^{2}
\left( \hat{A}_i(k,l) - A_i(k,l) \right)^2
$$


where A_i(k,l) is the (k,l)th element of the theoretically predicted matrix A_i and
is a function of the shape parameters, and Â_i(k,l) is the (k,l)th element of the
empirically measured affine transform matrix Â_i. Ideally, each term in this error
sum should be weighted by the inverse of the standard deviation of the
measurement error of that particular entry. A specific characterization of the probability
distribution of the measurement errors in the entries of the affine transform
matrices Â_i is not yet available. We expect it to depend on the particular algorithm
used for the estimation of the affine transforms. In the absence of a particularly
appropriate model, we will proceed on the (convenient!) assumption that these
errors are independent and normally distributed with standard deviation Δa.

Table 1. True and Estimated Surface Parameters

For minimizing the error function, we simply used the gradient descent routine
in the Mathematica package; any of a number of variants, such as
conjugate gradient, could have been used equivalently. We obtain an initial
guess for the shape parameters (σ, θ_t, rκ_t, rτ) using a linear algorithm based on
the singular value decomposition of the Â_i matrices [13]. We initially set rκ_b
equal to the initial value of rκ_t.
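
The minimization described above can be sketched in a few lines. This is only an illustration under stated assumptions. It uses a plain fixed-step gradient descent with a finite-difference gradient rather than the Mathematica routine the authors used, and `toy_predict` is a hypothetical two-parameter stand-in for the real five-parameter model of the predicted affine matrices. All names are illustrative.

```python
import numpy as np

def chi_squared(params, predict, directions, measured):
    """Sum of squared differences between measured and predicted 2x2 matrices."""
    return sum(np.sum((A_hat - predict(params, v)) ** 2)
               for v, A_hat in zip(directions, measured))

def numerical_gradient(f, x, h=1e-6):
    """Central-difference gradient of a scalar function f at x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        grad[i] = (f(x + step) - f(x - step)) / (2.0 * h)
    return grad

def gradient_descent(f, x0, step_size=0.05, iterations=500):
    """Fixed-step gradient descent starting from the initial guess x0."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(iterations):
        x -= step_size * numerical_gradient(f, x)
    return x

def toy_predict(params, v):
    """Hypothetical stand-in model: identity plus a term linear in the image step v."""
    return np.eye(2) + np.outer(params, v)

# Two image directions, as suggested by the equation-counting argument.
directions = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
true_params = np.array([0.3, -0.2])
measured = [toy_predict(true_params, v) for v in directions]

estimate = gradient_descent(
    lambda p: chi_squared(p, toy_predict, directions, measured),
    x0=np.zeros(2))
```

In the real problem the starting point would be the initial guess from the linear algorithm, which matters because the five-parameter error surface is nonlinear and gradient descent only finds a local minimum.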

Table 1 shows the results on a number of synthetic examples. The image
“noise” is the “wire” image, with added noise of standard deviation 30. The first
four surfaces are planar, followed by a cylinder and two spheres. We also ran
our algorithm on two real images. Figure 3 shows two of the synthetic images,
as well as two natural images. We get better slant and tilt estimates than those
provided by our simple linear algorithm, and also significantly better estimates
of the curvature parameters [13]. The algorithm does quite well even in the
presence of noise. We get good results on the “straw” planar surface, even though
this texture does not satisfy the isotropy assumption commonly used by other
researchers. Since we do not have ground truth for the natural images, we
indicate the computed surface orientation by a projected circle in the image, and
give the shape estimates in the captions. We get quite reasonable orientation
estimates, and believable curvature estimates, for both of the natural images.

We will now discuss the determination of confidence intervals on the
orientation and shape estimates. For more details on the methodology that we follow,
the reader is referred to [16], Chapters 14.4 and 14.5.

σ = 68, θ_t = −96, rκ_t = −0.08, rκ_b = 0.07, rτ = −0.06
Left pt: σ = 19, θ_t = −8, rκ_t = 0.53, rκ_b = −0.12, rτ = 0.00
Right pt: σ = 40, θ_t = −11, rκ_t = 2.9, rκ_b = −0.03, rτ = 0.58
Fig. 3. Experimental images.

Let us abbreviate the five geometrical parameters (σ, θ_t, rκ_t, rκ_b, rτ) as g_i, i =
1, 2, ..., 5. The gradient of χ² with respect to the parameters g will be zero at the
χ² minimum. To obtain confidence intervals on the parameters, one computes
the so-called curvature matrix [α] which is defined as half of the Hessian of the
χ² function

$$
\alpha_{kl} = \frac{1}{2\,\Delta a^2}\,\frac{\partial^2 \chi^2}{\partial g_k\,\partial g_l}
$$

Since we have five geometrical shape parameters g_i, this is a 5 × 5 symmetric
matrix. The inverse of the curvature matrix is the covariance matrix, C, of the
fit, on the assumption of normally distributed errors. In that case the standard
errors in the parameters are given by √C_ii. The confidence interval for parameter
g_i, ±δg_i, is ±√C_ii for 68 percent confidence and ±2√C_ii for 95 percent confidence.
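
The recipe above (curvature matrix, covariance matrix, standard errors) can be sketched as follows. This is an illustration, not the authors' code. The Hessian is approximated by central differences, the χ² passed in is assumed to be the unweighted sum of squares so that the 1/Δa² factor is applied explicitly, and the quadratic `toy_chi2` and all names are hypothetical.

```python
import numpy as np

def numerical_hessian(f, x, h=1e-3):
    """Central-difference Hessian of a scalar function f at the point x."""
    n = x.size
    hess = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = h
            ej[j] = h
            hess[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                          - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * h * h)
    return hess

def confidence_intervals(chi2, g_min, delta_a):
    """Return the 68% and 95% half-widths for each parameter at the chi-square minimum."""
    alpha = numerical_hessian(chi2, g_min) / (2.0 * delta_a ** 2)  # curvature matrix
    covariance = np.linalg.inv(alpha)                              # C = alpha^-1
    std_err = np.sqrt(np.diag(covariance))
    return std_err, 2.0 * std_err

def toy_chi2(g):
    """Hypothetical unweighted chi-square with its minimum at (1, -2)."""
    return 2.0 * (g[0] - 1.0) ** 2 + 8.0 * (g[1] + 2.0) ** 2

ci68, ci95 = confidence_intervals(toy_chi2, np.array([1.0, -2.0]), delta_a=0.1)
```

Because the toy function is steeper in the second parameter, that parameter receives the narrower interval, which is the same mechanism by which the ideal observer analysis ranks shapes by difficulty.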

As mentioned above, if we assume that the errors in the entries of the
affine transformation matrices Â_i are independent and identically normally
distributed, and if we have an estimate for the standard deviation of these errors,
we can give confidence intervals for the estimates of the shape parameters. We
have calculated the 68 percent confidence intervals for the parameters, assuming¹
a measurement error Δa of standard deviation 0.0323. Using this value for the
standard deviation, 66 percent of the estimated parameters for the synthetic
examples fall within the 68 percent confidence intervals.

In addition to obtaining confidence intervals for our empirical shape
estimates, we can use the theory outlined above to develop an ideal observer model
for shape from texture. Roughly speaking, an ideal observer gives us a
prediction for the best performance one expects out of any estimator, given the visual
information and the measurement error. As such it is of interest both for
computer vision shape from texture algorithms and for predicting the performance
of the human visual system. In the context of shape from texture models based
on discrete texels using isotropy and first order homogeneity assumptions, such
ideal observers have been developed by Blake et al. [3].

The uncertainty in the shape estimates depends upon the measurement errors
in the earlier stages of processing; here, in the estimation of the affine transforms.
The errors in the affine transforms will of course depend on the texture being
viewed. For example, if the texture is a realization of a Poisson process then
we expect that for sparser textures it will be more difficult to estimate the
affine transforms accurately. Without prior knowledge of the distribution of the
texture, however, what can we say about the distribution of errors in the affine
transforms? A simple choice would be to assume, as we did previously, that
the errors in the affine transform parameters are independent and identically
distributed normal random variables. While clearly this assumption would be
incorrect for, e.g., highly anisotropic texture, it gave us very reasonable results
for the range of textures earlier in this section. One may think of the situation
as follows: we have a number of surfaces with different shapes and orientations,
all with similar textures which satisfy the above assumption, and we wish to
know whether shape estimation is more difficult for some of these shapes than
others. As above, we will find confidence intervals for the shape parameters for a
number of these hypothetical shapes. The confidence intervals give us a measure
of the uncertainty in the shape estimates.

¹ We computed this value as the standard deviation of the errors in the affine
transformation matrices for the examples used in this paper. The errors are known because
we know the ground truth in these synthetic examples.

We present here the results of two sample ideal observer “experiments.” For
further “experiments,” see [13]. Since the confidence intervals will depend on the
measurement error in the elements of the affine transformations, we give the
confidence intervals in terms of relative units. To obtain the actual expected error
one would multiply these values by the standard deviation of the measurement
error. In the first experiment, in Fig. 4a, we varied the slant of a planar surface,
and plot the confidence intervals for tilt estimates. We see that we expect
improved tilt estimates for higher slants. Blake et al. [3] report similar results for
their ideal observer based on compression gradient. In the second experiment,
we varied the slant of a cylindrical surface, keeping the tilt perpendicular to the
axis of the cylinder. This amounts to computing shape estimates for a series of
points along the circumference of the cylinder. In Fig. 4b we plot the confidence
intervals for the estimate of rκ_t. Note that for smaller slants we expect more
error in the curvature parameter. This may explain much of the error in rκ_t
for both our synthetic cylinder example and for the first point in the real
cylinder example. Observations such as these suggest several lines of psychophysical
investigation.


Fig. 4. Ideal observer experiments, described in text.

ABSTRACT

Previous conflicting reports concerning fully-depleted SOI
device hot electron reliability are partially due to
misunderstanding over the maximum channel electric field
(E_m). Experimental results using SOI MOSFETs with body
contacts indicate that E_m is just a weak function of thin-film
SOI thickness (T_si) and E_m can be significantly lower than in a
bulk device with drain junction depth (X_j) comparable to T_si.
The theoretical correlation between SOI MOSFET gate current
and substrate current is experimentally confirmed. This
provides a means (I_G) of studying E_m in SOI devices without
body contacts. Both N- and P-MOSFETs can have better
hot-carrier reliability than comparable bulk devices. Thin film SOI
MOSFETs have better prospects for meeting breakdown voltage
and hot-electron reliability requirements than previously
thought.


1.  INTRODUCTION 

Thin-film fully-depleted (FD) SOI MOSFETs have
attracted much attention because of their large drain saturation
current, absence of kink effect, and superior subthreshold
leakage. However, as far as device reliability and breakdown
voltage are concerned, previous reports are divided on whether
FD SOI devices have reduced or enhanced hot-carrier
susceptibility and sensitivity to SOI film thickness [1]-[7]. The
main difficulty is that substrate current (I_SUB), a convenient
monitor of channel field for bulk MOS devices, cannot be
measured on the SOI devices with floating body. While gate
current has been suggested as a lifetime parameter for SOI
N-MOS devices [6], gate current is difficult to measure and
has not been confirmed as a valid monitor of channel
field without substrate current [7]. In this study, for the first
time, using a special SOI device structure with body contact,
both substrate (body) current (I_SUB) and gate current (I_G) were
directly measured in the same devices. Based on this
experiment, not only is the channel field in SOI devices
quantified, but also the correlation between I_SUB and I_G is
established. Finally, the hot carrier degradation lifetimes of SOI
N- and P-MOSFETs are compared with those of bulk devices.


II.  EXPERIMENTS 

FD N- and P-channel SOI devices were fabricated on
SIMOX wafers using a CMOS process. The buried oxide
thickness is about 3500 Å. LOCOS isolation and 3000 Å in-situ
doped N+ poly gate were used. To collect I_SUB, N-MOS
devices have a special P+ region to contact the P-type body, as
illustrated in Fig.1. The P+ region was formed by P-MOS S/D
implant (BF2, 4×10^15 cm^-2, 30 keV). Similarly, P-MOS
devices have an N+ region to contact N-type body using N-MOS
S/D implant (As, 4×10^15 cm^-2, 70 keV). The measured I_SUB in
these structures is found to be proportional to the channel width,
indicating a high efficiency in collecting I_SUB.


Fig.1 A schematic of the body contact in N-MOS SOI
devices. SOI P-MOS devices have a similar structure
with N+ and P+ regions exchanged.


III. RESULTS AND DISCUSSIONS

(A)  Analytical  Hot-Carrier  Models 

CH3332-4/94/0000-0052$01.00 © 1994 IEEE/IRPS

In bulk MOSFETs, the maximum channel field can be
estimated as follows [8]:

$$
E_m = \frac{V_D - V_{DSAT}}{\ell} \tag{1}
$$

$$
\ell = 0.22\,T_{ox}^{1/3}\,X_j^{1/2} \tag{2}
$$

where X_j is the drain junction depth and ℓ is the characteristic
length. The hot-carrier currents, I_SUB and I_G, can be expressed
as follows [9]:

$$
I_{SUB} = \frac{A_i}{B_i}\,(V_D - V_{DSAT})\,I_D\,\exp\!\left(-\frac{B_i}{E_m}\right) \tag{3}
$$

where A_i and B_i are known constants for impact ionization rate
of channel carriers [9]. B_i is around 1.7×10^6 V/cm for
electrons and 3.7×10^6 V/cm for holes [10].
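
The bulk model of Eqs. 1 to 3 can be evaluated directly, as in the sketch below. It assumes that all lengths in Eq. 2 are expressed in centimetres, which is the usual convention for this empirical formula but is not stated in the text. The value passed for A_i is a hypothetical placeholder, since the text gives numbers only for B_i, and all function names are illustrative.

```python
import math

B_I_ELECTRONS = 1.7e6  # V/cm, value quoted in the text for electrons

def characteristic_length_cm(t_ox_cm, x_j_cm):
    """Eq. 2: l = 0.22 * Tox^(1/3) * Xj^(1/2), with lengths assumed to be in cm."""
    return 0.22 * t_ox_cm ** (1.0 / 3.0) * x_j_cm ** 0.5

def max_channel_field(v_d, v_dsat, length_cm):
    """Eq. 1: Em = (VD - VDSAT) / l, in V/cm."""
    return (v_d - v_dsat) / length_cm

def isub_over_id(v_d, v_dsat, length_cm, a_i, b_i=B_I_ELECTRONS):
    """Eq. 3 divided by ID: (Ai/Bi) * (VD - VDSAT) * exp(-Bi/Em)."""
    e_m = max_channel_field(v_d, v_dsat, length_cm)
    return (a_i / b_i) * (v_d - v_dsat) * math.exp(-b_i / e_m)

# Hypothetical bulk device: 100 Angstrom gate oxide, 0.2 um junction depth.
length_cm = characteristic_length_cm(t_ox_cm=1.0e-6, x_j_cm=2.0e-5)
# Hypothetical bias point and A_i value, chosen only to exercise the formula.
ratio = isub_over_id(v_d=4.0, v_dsat=1.0, length_cm=1.0e-5, a_i=2.0e6)
```

The exponential dependence on 1/E_m is what makes I_SUB/I_D a sensitive monitor of the channel field: a modest change in ℓ produces a large change in the ratio.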


$$
I_G = C_1(E_{ox})\,I_D\,\exp\!\left(-\frac{\varphi_b}{E_m \lambda}\right) \quad \text{for N-MOS} \tag{4}
$$

$$
I_G = C_2(E_{ox})\,I_{SUB}\,\exp\!\left(-\frac{\varphi_b}{E_m \lambda}\right) \quad \text{for P-MOS} \tag{5}
$$

where φ_b is the barrier height at the Si/SiO2 interface for electrons and
λ is the scattering mean-free path of electrons. It is worth
noting that the gate current of P-MOSFETs results from
electron injection rather than hole injection into the oxide in the
low-V_G regime [11].

From Eqs. 3, 4, and 5, we find:

$$
\ln\frac{I_{SUB}}{I_D\,(V_D - V_{DSAT})} = \ln\frac{A_i}{B_i} - \frac{B_i\,\ell}{V_D - V_{DSAT}} \tag{6}
$$

$$
\frac{I_G}{I_D} = C_1(E_{ox}) \left[ \frac{B_i}{A_i\,(V_D - V_{DSAT})}\,\frac{I_{SUB}}{I_D} \right]^{\varphi_b/(B_i \lambda)} \quad \text{for N-MOS} \tag{7}
$$

$$
\frac{I_G}{I_{SUB}} = C_2(E_{ox}) \left[ \frac{B_i}{A_i\,(V_D - V_{DSAT})}\,\frac{I_{SUB}}{I_D} \right]^{\varphi_b/(B_i \lambda)} \quad \text{for P-MOS} \tag{8}
$$

(B) Hot-Carrier Currents and Channel Electric Field

In SOI devices, whether and how Eq. 1 and Eq. 2 can be used
is not clear. Colinge [3] used T_si as a substitute for X_j in
Eq. 2 to estimate the channel field, while Chen, et al. [6]
reported that this overestimates E_m. From Eq. 6, a plot of
ln(I_SUB/(I_D(V_D − V_DSAT))) versus 1/(V_D − V_DSAT) gives
one straight line for all bias voltages, for both N- and
P-MOSFETs, as shown in Fig.2. The magnitude of the slope of this
straight line gives B_i ℓ, from which ℓ and hence E_m can be determined
experimentally. It is found that using thin film SOI thickness
T_si as a substitute for X_j can overestimate E_m by a factor of 2
for both N- and P-MOS devices (roughly equivalent to
overestimating V_D by more than 50%).


Fig.2 Experimental determination of the characteristic
length ℓ in E_m = (V_D − V_DSAT)/ℓ of the
devices. (a) N-MOS devices: For the bulk device, the
measured ℓ is 461 Å. For the SOI devices, the measured ℓ's
are 1194 and 1254 Å, while the ℓ's based on the bulk
MOSFET E_m model with X_j = T_si are 554 Å and 616 Å. (b)
P-MOS devices: For the bulk device, the measured ℓ is 426 Å.
For the SOI devices, the measured ℓ's are 1213 and 1271 Å,
while the ℓ's based on the bulk MOSFET E_m model with
X_j = T_si are 554 Å and 616 Å. The data clearly show that E_m
in SOI devices can be much lower than in bulk devices.
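
The slope-based extraction of ℓ described above can be sketched with an ordinary least-squares line fit. The example below is an illustration only. The data are synthetic, generated from Eq. 3 with a hypothetical A_i and a chosen ℓ, and the function name is not from the paper.

```python
import numpy as np

B_I_ELECTRONS = 1.7e6  # V/cm, value quoted in the text for electrons

def extract_characteristic_length_cm(v_d, v_dsat, i_sub, i_d, b_i=B_I_ELECTRONS):
    """Fit ln(ISUB / (ID (VD - VDSAT))) against 1/(VD - VDSAT); the slope is -Bi * l."""
    excess = np.asarray(v_d, dtype=float) - v_dsat
    x = 1.0 / excess
    y = np.log(np.asarray(i_sub) / (np.asarray(i_d) * excess))
    slope, _intercept = np.polyfit(x, y, 1)
    return -slope / b_i

# Synthetic measurements from Eq. 3 (hypothetical A_i, drain current and bias range).
true_length_cm = 1.2e-5          # 1200 Angstrom, similar in size to the SOI values in Fig.2
a_i = 2.0e6                      # hypothetical impact-ionization prefactor
v_dsat = 1.0
v_d = np.array([3.0, 3.5, 4.0, 4.5, 5.0])
i_d = np.full_like(v_d, 1.0e-3)
e_m = (v_d - v_dsat) / true_length_cm
i_sub = (a_i / B_I_ELECTRONS) * (v_d - v_dsat) * i_d * np.exp(-B_I_ELECTRONS / e_m)

recovered_cm = extract_characteristic_length_cm(v_d, v_dsat, i_sub, i_d)
```

With measured data, V_DSAT generally varies with bias, so it would be supplied per point rather than as a single constant. The fit accepts an array for `v_dsat` without modification.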

It can be deduced from Eq.3 that I_SUB/I_D is a simple
monitor of the maximum channel field E_m. Fig.3(a) and (b)
show the distribution of measured I_SUB/I_D of N-MOS SOI
and bulk devices, respectively. Although T_si variation across a
wafer is as high as 200 Å, the I_SUB/I_D variation is only ±10%
and comparable to the ±10% variation in the bulk case,
confirming the weak E_m dependence on T_si. Besides, SOI
devices have much lower I_SUB/I_D, hence E_m, by a factor of 4
than comparable bulk devices (see Fig.3(a) and (b)), although
V_DSAT is about the same. T_si was determined by the CV
technique [12].

Fig.3 Distribution of I_SUB/I_D of (a) SOI and (b) bulk
N-MOS devices. Both I_SUB and I_D were measured at the
maximum I_SUB point with V_D = 4 V. The I_SUB/I_D sensitivity
to SOI T_si is very weak. The variations of I_SUB/I_D across
both wafers are about ±10%. SOI devices have lower
I_SUB/I_D, hence E_m, by a factor of 4 than the comparable
bulk MOSFETs.

The P-MOS data is shown in Fig.4. It can be seen that
the measured I_SUB/I_D is a weak function of T_si. The
I_SUB/I_D values for the SOI P-MOS devices are lower than
those in the bulk devices even though the SOI devices have a
thinner gate oxide. Since the gate current in P-MOS devices
is much larger than that in N-MOS devices and thus
easier to measure, the I_G/I_D data is also presented in the
same figure. Clearly, the electron injection into the gate oxide
is also a weak function of T_si. Besides, the I_G/I_D values
for the SOI devices are a little higher than for the bulk devices,
probably because the vertical field of the gate oxide on the SOI
devices is higher than on the bulk devices.

Fig.4 Distribution of I_SUB/I_D and I_G/I_D of (a) SOI and
(b) bulk P-MOS devices. I_SUB, I_G and I_D were measured at
the maximum I_SUB point with V_D = −5 V. The I_SUB/I_D and
I_G/I_D sensitivity to SOI T_si is very weak. SOI devices have
lower I_SUB/I_D, hence E_m, than the bulk MOSFETs.

Fig.5 demonstrates the combined effects of 2D and lateral
doping gradient in the drain region. The channel field increases
exponentially due to the 2D effect, while it decreases near the
drain region due to the lateral doping effect. Both the lower E_m
and weaker dependence on T_si in thin film SOI devices can be
attributed to the lateral drain doping gradient [13]. Simulations
found relatively low E_m and weak sensitivity to T_si within
the range of interest (500-1100 Å). E_m is a function of the
lateral drain doping gradient even for non-LDD As drains. In
the case of bulk MOSFETs, the doping gradient varies with X_j;
this contributes to the E_m dependence on X_j. In the case of SOI
MOSFETs, lateral doping gradient is decoupled from X_j (or
T_si). Therefore, E_m can be lower in an SOI device than in a bulk
device with small X_j = T_si, and can have a weak dependence on T_si as
well.

Fig.5 Demonstration of the 2D and lateral drain doping
gradient effects on channel electric field distributions in a MOS
device. The lateral doping gradient L is defined as


'  1  dNp' 


at  around  Njf=]-5xl0^^  cm~^. 


(C)  Hot-Carrier  Effects 

I_SUB and I_G are correlated to each other by a power-law
relationship, because both of them are exponential functions of
E_m in equations 3, 4 and 7 [14],[15]. Fig.6 demonstrates the
relationship between I_SUB and I_G for SOI N-MOS devices.
Based on equation 7, to the first order, the log(I_G/I_D) versus
log(I_SUB/I_D) plot is a function of only oxide field, E_ox = (V_G −
V_D)/T_ox, and its slope is equal to φ_b/(B_i λ). This slope is 2.1,
suggesting that B_i λ, i.e., the critical electron energy for impact
ionization, is about 1.5 eV if we assume the Si/SiO2 barrier
height for electrons is φ_b = 3.1 eV. This result agrees very well
with the bulk case [15]. The agreement between theoretical
prediction and experimental data suggests that I_G can be used
as a monitor of E_m for SOI N-MOS devices with the body
contacts, e.g., ℓ can be estimated from the slope of the
log(I_G/I_D) versus 1/(V_D − V_DSAT) plot.
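
The arithmetic behind the 1.5 eV figure can be reproduced with a log-log fit, as sketched below. The data here are synthetic and hypothetical, constructed to follow an exact power law with exponent 2.1. Only the barrier height of 3.1 eV and the slope of 2.1 come from the text.

```python
import numpy as np

def critical_energy_ev(i_sub_over_i_d, i_g_over_i_d, phi_b_ev=3.1):
    """Fit the log-log slope of IG/ID versus ISUB/ID and return (phi_b / slope, slope)."""
    slope, _intercept = np.polyfit(np.log10(i_sub_over_i_d),
                                   np.log10(i_g_over_i_d), 1)
    return phi_b_ev / slope, slope

# Hypothetical current ratios obeying IG/ID = const * (ISUB/ID)^2.1.
i_sub_ratio = np.array([1.0e-4, 3.0e-4, 1.0e-3, 3.0e-3])
i_g_ratio = 1.0e-3 * i_sub_ratio ** 2.1

energy_ev, fitted_slope = critical_energy_ev(i_sub_ratio, i_g_ratio)
```

With a slope of 2.1 and φ_b = 3.1 eV the ratio is 3.1/2.1, or roughly 1.48 eV, which the text rounds to about 1.5 eV.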


Fig.6 Correlation between I_SUB and I_G in SOI N-MOS
devices. The slope is about 2.1. Based on the expression
(4), the slope approximately equals φ_b/(B_i λ), from
which B_i λ can be calculated.

Previous reports are divided on whether FD SOI devices are
less [3]-[7] or more [1], [2] vulnerable to hot-carrier effects. The
fact that FD SOI devices often have a lower channel field E_m
than bulk devices due to a more graded drain doping profile
should also be reflected in the device degradation under
hot-carrier stress. In addition, Fig.7 shows that the saturation drain
current shift for a given maximum I_SUB is not larger in SOI
than in bulk N-MOS devices. Fig.8 shows the extrapolated lifetime
for SOI N- and P-MOS devices, respectively, as compared
to the bulk devices. The devices were stressed under the
worst-case stress conditions (maximum I_SUB for the N-MOS case and
maximum I_G for the P-MOS case). Although X_j for the bulk
devices is as large as 2000 Å, the SOI devices with T_si = 800 Å
are still slightly less vulnerable than the bulk devices to
hot-carrier degradation, at least in this case study.


Fig.7 Comparison of the saturation drain current shifts for
a specific maximum I_SUB stress between the SOI and bulk
N-MOS devices.

Fig.8 Device hot-carrier lifetime of the SOI and bulk
devices as a function of (a) substrate current in N-MOS
devices and (b) gate current in P-MOS devices. The
lifetime τ is defined as the stress time to reach 10%

IV.  CONCLUSIONS 

In conclusion, channel field in SOI MOSFETs is lower
than previously assumed. Hot-carrier degradation of thin film
FD SOI devices can be less severe than that of a similar bulk
MOSFET. The decoupling of SOI MOSFET junction depth
(T_si) and lateral doping gradient is little discussed but of
significant advantage in drain engineering. This realization
improves the prospects of thin film SOI devices meeting
breakdown voltage and hot-carrier effects requirements. The
correlation between I_SUB and I_G is confirmed and suggests that
the channel field may be characterized through the measurable
I_G. I_G measurement should help drain structure design
engineering.

Abstract

While recognizer technologies are being applied to increasingly
challenging problems, the robustness of these systems to
acoustic variability (such as room acoustics, background noise, and
modified microphone or channel characteristics) is still far poorer than that
achieved by human listeners. A wide variety of techniques have been
developed by researchers to attempt to deal with these problems. This paper
will describe the current state of our approach to this problem at
Berkeley, which has focused on the modeling of some gross temporal properties
of human hearing.

1  INTRODUCTION 

Speech  processing,  and  in  particular  speech  recognition  by  machine,  is  severely  hzimpered  by 
problems  such  a^  the  effect  of  room  acoustics,  background  noise,  and  mismatches  between 
recording  characteristics  for  the  systems  used  for  training  and  recognition  (for  instance, 
different  handsets  for  the  telephone). 

Human  beings  are  also  hampered  in  their  speech  understanding  by  these  factors,  but  we 
appear  to  do  much  better.  Certainly  much  of  this  robustness  is  due  to  our  ability  to  predict 
and  interpret  speech  on  the  bcisis  of  our  understanding  of  the  language  and  our  knowledge 
of  the  world  and  how  it  works.  However,  even  the  recognition  of  digits  in  the  presence  of 
strong  additive  noise  is  quite  a  bit  better  for  people  than  for  our  best  recognizers.  This  is 
a  task  that  uses  relatively  little  knowledge  about  language  other  than  knowing  the  digits 
themselves,  since  in  many  applications  any  digit  can  follow  any  other.  Therefore,  we  think 
that it is reasonable to try to learn acoustical processing approaches from human solutions
to  these  problems  in  order  to  guide  our  study  of  the  design  of  robust  machine  recognizers. 

The  oldest  engineering  methods  for  improving  speech  processing  in  the  presence  of  noise 
and  spectral  coloration  are  based  on  Wiener  filtering.  The  basic  idea  of  this  approach  is  to 
estimate  the  noise  spectrum  and  the  speech  spectrum  and  find  a  good  compromise  between 
damaging  the  speech  and  getting  rid  of  the  noise.  For  instance,  in  a  simple  case  in  which 
the  noise  and  speech  were  in  different  parts  of  the  audio  spectrum,  you  could  simply  filter 
out  the  noise.  In  most  practical  cases,  the  problem  is  not  this  easy.  However,  if  the  noise 
doesn’t  change  too  much  over  time,  is  uncorrelated  with  the  speech,  and  its  spectrum 
can  be  estimated  accurately,  one  can  also  subtract  out  much  of  its  effects.  In  the  case 
of  additive  noise,  this  approach  is  commonly  called  spectral  subtraction  [1].  In  the  case 
of linear filtering effects (for instance the effect that results from using for training and
testing telephone channels that have differing spectral characteristics), this subtraction is
done in the logarithmic spectral domain and is closely related to what has been called
blind deconvolution. This was named for the process used to remove much of the “tinny”
audio characteristics on old Caruso recordings by normalizing out the linear effects of the
mechanical gramophone recording horn [8]. In more recent years similar techniques have
been  used  to  normalize  out  the  effect  of  the  frequency  responses  of  different  microphones: 
this  is  commonly  done  in  the  cepstral  domain,  which  is  a  Fourier  decomposition  of  the  log 
spectrum.  The  technique  is  commonly  called  mean  cepstral  subtraction,  since  it  basically 
consists  of  subtracting  the  average  cepstrum  (or  equivalently  the  average  log  spectrum) 
from  the  speech. 
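
As a rough illustration of the mean subtraction idea described above, the following Python sketch uses made-up numbers to show that a fixed linear channel, which multiplies the power in each band by a constant, becomes an additive offset in the log spectrum and is removed by subtracting the per-band average over time. The function name `subtract_mean_log_spectrum`, the toy spectra, and the gains are all assumptions made for this example.

```python
import math

def subtract_mean_log_spectrum(frames):
    """Subtract the per-band time average from a list of log-spectral frames."""
    n_frames = len(frames)
    n_bands = len(frames[0])
    means = [sum(f[b] for f in frames) / n_frames for b in range(n_bands)]
    return [[f[b] - means[b] for b in range(n_bands)] for f in frames]

# Toy power spectra: 4 frames x 3 bands (illustrative values only).
power = [[1.0, 4.0, 2.0],
         [2.0, 1.0, 8.0],
         [4.0, 2.0, 1.0],
         [8.0, 8.0, 4.0]]

# A fixed channel scales each band by a constant gain (assumed values).
gains = [0.5, 2.0, 10.0]
filtered = [[p * g for p, g in zip(frame, gains)] for frame in power]

log_clean = [[math.log(p) for p in frame] for frame in power]
log_filtered = [[math.log(p) for p in frame] for frame in filtered]

norm_clean = subtract_mean_log_spectrum(log_clean)
norm_filtered = subtract_mean_log_spectrum(log_filtered)

# After normalization the two versions agree up to rounding error.
max_diff = max(abs(a - b)
               for fa, fb in zip(norm_clean, norm_filtered)
               for a, b in zip(fa, fb))
print(max_diff)
```

The same cancellation does not hold for additive noise, which is not a constant offset in the log domain.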

These methods have helped to some extent, but it is still true that adding even a moderate
amount of realistic noise has a strongly negative effect on speech recognition systems.
For  this  reason  we  have  focused  on  developing  algorithms  that  have  better  robustness  to 
unpredictable  acoustic  test  conditions.  We  have  tended  to  use  methods  inspired  by  human 
hearing,  although  we  depart  from  the  biological  example  whenever  it  seems  to  make  sense. 
This  could  be  likened  to  the  development  of  airplanes  that  utilize  Bernoulli’s  principle, 
which correctly describes the physics that permits birds to fly, without blindly following
the  exact  design  and  trying  to  build  747’s  that  flap  their  wings.  In  our  case,  much  of  our 
work  has  been  based  on  the  observation  that  the  human  nervous  system  is  most  responsive 
to  novelty,  and  in  general  that  there  is  a  certain  range  of  speeds  of  sensory  phenomena 
that we are most sensitive to. This is a fundamental fact about our sensory systems, and
human  speech  and  language  developed  in  this  context;  as  a  species  we  developed  speech  in 
a  form  that  we  could  hear  well.  This  suggests  that  our  machine  speech  analysis  systems 
might  benefit  from  these  timing-based  factors.  Over  the  last  few  years,  we  have  performed 
a  number  of  experiments  that  suggest  that  this  is  a  worthwhile  strategy.  The  rest  of  this 
paper  will  briefly  describe  an  approach  we  have  developed  to  take  advantage  of  one  of  these 
properties  of  human  hearing,  and  an  experiment  that  indicates  the  utility  of  the  approach 
on  an  isolated  digits  task.  The  paper  will  conclude  with  a  discussion  of  the  future  directions 
in  our  speech  robustness  work  at  ICSI  and  UC  Berkeley  (in  collaboration  with  the  Oregon 
Graduate  Institute). 

2  RASTA  PROCESSING 

As  mentioned  above,  natural  sensory  systems  tend  to  respond  more  strongly  to  novel 
stimuli  rather  than  to  continuations  of  the  same  old  thing.  This  strategy  has  some  obvious 
consequences  for  survival  of  the  organism  (you  would  really  want  to  know  about  the  rustle  in 
the  bushes,  as  it  might  be  due  to  something  that  wants  to  eat  you).  Additionally,  however, 
sensitivity  to  novelty  tends  to  have  a  normalizing  property.  If  you  are  most  interested  in 
change,  then  a  constant  background  (for  instance,  of  illumination,  color,  or  acoustic  noise) 
will  have  less  of  an  effect  on  perception.  This  is  desirable,  since  the  message  is  often  the 
same  regardless  of  the  background  conditions. 

Many  experiments  have  been  done  with  human  listeners  to  determine  the  sensitivity 
to different aspects of speech. In one such experiment, for instance, Summerfield and
colleagues showed that the perception of speech-like sounds depends on the spectral difference
between the current and preceding sounds [9]. Other experiments (for instance those
reported in [4]) showed that human listeners were relatively insensitive to slowly varying
sounds.  This  may  partially  explain  why  human  listeners  do  not  seem  to  mind  a  slow 
change  in  the  frequency  characteristics  of  the  communication  environment,  or  why  steady 
background  noise  often  does  not  severely  impair  human  speech  communication. 

Thus, to make speech analysis less sensitive to slowly changing or steady-state factors in
speech, we developed a method based on a simple bandpass or highpass filtering of the time
trajectories of each spectral channel. This is equivalent to a spectrum based on change,
and so we called it the RelAtive SpecTrAl or RASTA method [3]. Initially our experiments
were only done by filtering in the log spectral domain, which is the optimal function to
filter for the case in which some relatively constant linear filtering had been done on the
speech (for instance, to represent the frequency response of the microphone or handset).
These  experiments  showed  a  greatly  increased  robustness  to  strong  channel  variation,  and 
variants  of  the  methods  have  been  used  by  many  laboratories  worldwide.  However,  the 
method  initially  showed  no  increased  robustness  to  additive  noise. 
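
The filtering of time trajectories can be sketched as follows. This is a minimal Python illustration that uses a generic first-order high-pass filter with an assumed pole of 0.98; it is not the filter of [3], and the function names and toy data are hypothetical. It shows that a constant offset in each log-spectral band, such as one introduced by a fixed channel, leaves the filtered output unchanged.

```python
def highpass_trajectory(x, pole=0.98):
    """First-order high-pass filter: y[t] = x[t] - x[t-1] + pole * y[t-1]."""
    y = []
    prev_x = x[0]      # treat the signal as constant before t = 0
    prev_y = 0.0
    for value in x:
        prev_y = value - prev_x + pole * prev_y
        prev_x = value
        y.append(prev_y)
    return y

def filter_log_spectrogram(frames, pole=0.98):
    """Filter the time trajectory of every spectral band independently."""
    n_bands = len(frames[0])
    bands = [highpass_trajectory([f[b] for f in frames], pole)
             for b in range(n_bands)]
    # Transpose back to a list of frames.
    return [[bands[b][t] for b in range(n_bands)] for t in range(len(frames))]

# Toy log spectrogram: 5 frames x 2 bands (illustrative values only).
log_spec = [[0.0, 1.0], [0.5, 1.5], [2.0, 0.5], [1.0, 1.0], [1.0, 2.0]]
channel_offset = [3.0, -1.5]   # constant log-domain offset per band (assumed)
shifted = [[v + o for v, o in zip(f, channel_offset)] for f in log_spec]

out_a = filter_log_spectrogram(log_spec)
out_b = filter_log_spectrogram(shifted)
```

Because only differences between successive frames enter the filter, `out_a` and `out_b` agree up to rounding error.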

More recently, we developed an extension to the original RASTA called J-RASTA (or
sometimes  lin-log  RASTA)  [6,  5].  In  this  method,  the  domain  in  which  the  speech  spectrum 
was  filtered  or  “relative”  was  a  function  of  the  noisiness  of  the  data.  For  very  noisy  data, 
the  filtering  was  done  on  something  very  close  to  the  power  spectrum,  and  for  clean  data 
the  filtering  was  done  on  something  closer  to  the  log  power  spectrum.  Typically  we  used 
a single family of functions that was parameterized by a “J” variable that was inversely
proportional  to  the  estimate  of  noise  power  derived  from  the  local  statistics  [2]. 
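
The text above does not give the functional form of this family. One function with the described behavior, assumed here purely for illustration, is y = ln(1 + J x): it is nearly linear in the power x when J x is small (noisy data, small J) and nearly logarithmic when J x is large (clean data, large J). The values of J and x below are arbitrary.

```python
import math

def lin_log(power, j):
    """Assumed compression function: ln(1 + J * power)."""
    return math.log1p(j * power)

power = 100.0
small_j = 1e-6     # stands in for a high noise estimate
large_j = 1e3      # stands in for a low noise estimate

linear_like = lin_log(power, small_j)   # close to J * power
log_like = lin_log(power, large_j)      # close to ln(J) + ln(power)

print(linear_like, small_j * power)
print(log_like, math.log(large_j) + math.log(power))
```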

This technique has recently been implemented in public domain software, and was shown
to be effective against both linear filtering and additive noise. Table 1 shows some results
on an isolated digits experiment. For this experiment, 200 speakers were used; in each of
4 tests, 150 speakers were used for training and 50 for test. By rotating (or “jackknifing”)
which speakers were the training or test portion, all of the data could be fairly used for
testing. The HMM Tool Kit (HTK) from Cambridge [10] was used to design and train a
Gaussian-mixture based speech recognizer.
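
The rotation of speakers can be sketched as below. This is a minimal Python illustration; the speaker IDs are placeholders and the helper name `jackknife_splits` is hypothetical. It only shows how 200 speakers yield 4 tests with 150 training and 50 test speakers each, so that every speaker is tested exactly once.

```python
def jackknife_splits(speakers, n_folds):
    """Yield (train, test) pairs; each fold serves as the test set once."""
    fold_size = len(speakers) // n_folds
    for k in range(n_folds):
        test = speakers[k * fold_size:(k + 1) * fold_size]
        train = speakers[:k * fold_size] + speakers[(k + 1) * fold_size:]
        yield train, test

speakers = list(range(200))            # speaker IDs are placeholders
splits = list(jackknife_splits(speakers, 4))
for train, test in splits:
    print(len(train), len(test))       # 150 50 for each of the 4 tests
```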

Table 1 shows that the RASTA approach, when applied in the J-RASTA formulation,
appears to provide some robustness to the effects of additive noise and linear filtering. In
particular, the error rate with added noise was one-third of that seen without the RASTA
processing. However, adding noise still quadrupled the error rate, even at a 10 dB SNR. At
this signal-to-noise ratio most humans would have very little increase in error on this task.
Therefore it is fair to say that more work needs to be done, and perhaps incorporating
some other simple characteristics of human hearing may be helpful.
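
The two ratios quoted in this paragraph can be checked directly from the error rates reported in Table 1. The short Python calculation below uses only those published numbers; the variable names are ours.

```python
# Error rates (percent) taken from Table 1.
plp_noise = 37.0       # PLP, test data with added noise
jrasta_clean = 3.7     # J-RASTA, clean test data
jrasta_noise = 13.7    # J-RASTA, test data with added noise

reduction = jrasta_noise / plp_noise      # roughly one-third
increase = jrasta_noise / jrasta_clean    # roughly a factor of four
print(round(reduction, 2), round(increase, 2))   # 0.37 3.7
```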

A  number  of  researchers,  including  some  at  Berkeley,  are  working  to  employ  better 
auditory  models  at  the  front  end  of  speech  recognition  systems.  This  may  in  fact  prove 
important. However, for the most part we are focusing on the importance of changing the
overall  statistical  system  in  order  to  accommodate  such  representations.  This  is  briefly 
described  next. 


            clean    noise    clean-filtered    noise-filtered
PLP           5.0     37.0              24.9              50.4
RASTA         3.3     50.0               3.6              40.4
J-RASTA       3.7     13.7               5.6              17.1

Table 1: Isolated digit (plus the words "yes" and "no") error rates in
percent,  using  HTK-based  Gaussian  mixture  system.  PLP  was  the 
basic  speech  feature  set  used,  which  was  modified  for  robustness  in 
the  other  analysis  methods.  For  all  cases,  the  recognizer  was  trained 
on  clean  speech.  Noise  indicates  test  data  with  SNR=10dB,  and 
noise-filtered  indicates  test  data  with  the  SNR=10dB  and  with  an 
additional  linear  distortion  introduced  by  filtering.  Note  that  all  of 
the  error  rates  are  somewhat  high  for  an  isolated  digits  recognizer; 
one  of  our  recognizers  with  more  features  and  a  more  advanced 
statistical  model  has  roughly  one-third  the  error  rate,  but  currently 
takes  much  longer  to  train  and  so  was  not  used  for  this  study. 


3 PERCEPTUALLY-BASED STATISTICAL MODELS

Traditionally, pattern recognition systems such as speech or character recognizers have
been divided into two major components: feature extraction (including all kinds of
preprocessing) and pattern matching. However, it is well known that the two components are
strongly interdependent. For instance, for the case of good acoustic conditions (matching
in training and test) we have sometimes observed that recognition performance for simple
context-independent subword-unit recognizers can be degraded by RASTA. However, we
have  seen  that  RASTA  has  worked  well  when  there  is  some  explicit  modeling  of  the  effects 
of  context  from  the  past.  This  is  so  because  RASTA  processing  increases  the  dependence 
on  this  context  because  of  its  filtering  process. 

Similarly,  incorporating  more  advanced  auditory  models  may  not  be  fruitful  without  a 
similarly  developed  statistical  pattern  classification  system  that  can  take  advantage  of  the 
auditory  features.  For  this  reason,  we  are  now  working  on  a  statistical  model  of  speech  that 
is  intrinsically  perceptual.  As  with  the  human  perceptual  system,  our  new  model  (called 
the  Stochastic  Perceptual  Auditory-event-based  Model,  or  SPAM)  focuses  on  novel  events, 
such  as  major  changes  in  the  speech  spectrum.  As  of  this  writing  we  have  developed  the 
basic  mathematical  theory  [7],  but  we  have  not  yet  developed  a  practical  system.  The 
major  idea  of  this  development  is  that  we  model  speech  as  a  succession  of  auditory  events, 
or  detected  novelties,  which  the  perceptual  system  must  disambiguate.  These  events  are 
connected  by  periods  in  which  the  acoustics  don’t  change  much.  The  major  difference  from 
classical  statistical  speech  models  is  that  the  system  is  not  trained  to  discriminate  between 
fine differences in steady-state regions, but rather to discriminate between regions of strong
transition that correspond to these auditory events. For instance, in the syllable “ma”, the
transition between the “m” and the “a” sound would receive more emphasis in the training
of the statistical system than the more constant portions of the two sounds. This is very
different from the modeling done in current recognizers, in which more emphasis is placed
on the longer and more constant middle parts of the two sounds. We believe that this may
be a bias of human speech perception as well, and that it should be a better match to
auditory features that have the potential to improve acoustical robustness for the overall
system.

Figure  1  shows  the  general  structure  of  speech  recognition  in  terms  of  some  of  the 
acoustical  sources  of  error  and  the  broad  kinds  of  solutions  that  are  being  developed. 


4  Conclusions 

As human listeners, we tend to take for granted how well we can understand speech in
the presence of strong acoustic interference. As speech system designers, however, we soon
learn that “adverse conditions” for our recognizers correspond to “normal conditions” for
people. 

So far, our algorithms have taken advantage of an extremely simple property of human
hearing to improve performance in the presence of noise and linear filtering. In the study
reported here, we cut the error for a moderately noisy case by roughly a factor of three
using these methods. However, the resulting error rate for the noisy case was still four
times larger than it had been without added noise. We continue to derive inspiration
from human perception in the development of speech recognition theory and systems. In
particular, we are now working on the development of a statistical system that will focus
modeling power on those acoustical segments that are critical to speech perception. If this
research direction (or some other approach to acoustical robustness) proves successful, we
can expect significant improvements in performance of commercial systems in future years,
since practical systems operate in less controlled acoustic conditions than are found in the
laboratory. 
Figure 1: A simple block diagram of the speech recognition process. The blocks above the
wavy  line  illustrate  the  steps  in  generating  the  signal  that  is  received  at  the  speech  recognition 
system.  This  breakdown  is  chosen  to  show  some  of  the  sources  of  degradation  in  the  received 
speech  signal.  The  blocks  below  the  wavy  line  are  the  major  recognizer  components,  chosen 
to  show  the  different  pieces  of  the  technology  that  can  potentially  be  used  to  correct  for  the 
nonideal  conditions  that  are  created  by  realistic  acoustical  situations.  This  paper  primarily  refers 
to  the  feature  extraction  block,  although  we  mention  the  statistical  recognition  block  to  some 
extent. 
Abstract


The most commonly used neural network models are not well suited
to direct digital implementations because each node needs to perform
a large number of operations between floating point values.
Fortunately, the ability to learn from examples and to generalize is
not restricted to networks of this type. Indeed, networks where each
node implements a simple Boolean function (Boolean networks) can
be designed in such a way as to exhibit similar properties. Two
algorithms that generate Boolean networks from examples are presented.
The results show that these algorithms generalize very
well in a class of problems that accept compact Boolean network
descriptions. The techniques described are general and can be applied
to tasks that are not known to have that characteristic. Two
examples of applications are presented: image reconstruction and
hand-written character recognition.


1  Introduction 

The main objective of this research is the design of algorithms for empirical learning
that generate networks suitable for digital implementations. Although threshold
gate networks can be implemented using standard digital technologies, for many
applications this approach is expensive and inefficient. Pulse stream modulation
[Murray and Smith, 1988] is one possible approach, but is limited to a relatively
small number of neurons and becomes slow if high precision is required. Dedicated
boards based on DSP processors can achieve very high performance and are very
flexible but may be too expensive for some applications.

The algorithms described in this paper accept as input a training set and generate
networks where each node implements a relatively simple Boolean function. Such
networks will be called Boolean networks. Many applications can benefit from
such an approach because the speed and compactness of digital implementations
are still unmatched by their analog counterparts. Additionally, many alternatives are
available to designers that want to implement Boolean networks, from full-custom
design to field programmable gate arrays. This makes the digital alternative more
cost effective than solutions based on analog designs.

Occam’s razor [Blumer et al., 1987; Rissanen, 1986] provides the theoretical foundation
for the development of algorithms that can be used to obtain Boolean networks
that generalize well. According to this paradigm, simpler explanations for the available
data have higher predictive power. The induction problem can therefore be
posed as an optimization problem: given a labeled training set, derive the
least complex Boolean network that is consistent¹ with the training set.

Occam's razor, however, doesn't help in the choice of the particular way of measuring
complexity that should be used. In general, different types of problems may
require different complexity measures. The algorithms described in sections 3.1 and
3.2 are greedy algorithms that aim at minimizing one specific complexity measure:
the size of the overall network. Although this particular way of measuring complexity
may prove inappropriate in some cases, we believe the approach proposed
can be generalized and used with minor modifications in many other tasks. The
problem of finding the smallest Boolean network consistent with the training set is
NP-hard [Garey and Johnson, 1979] and cannot be solved exactly in most cases.
Heuristic approaches like the ones described are therefore required.


2  Definitions 

We  consider  the  problem  of  supervised  learning  in  an  attribute  based  description 
language.  The  attributes  (input  variables)  are  assumed  to  be  Boolean  and  every 
exemplar  in  the  training  set  is  labeled  with  a  value  that  describes  its  class.  Both 
algorithms  try  to  maximize  the  mutual  information  between  the  network  output 
and  these  labels. 

Let variable $X$ take the values $\{x_1, x_2, \ldots, x_n\}$ with probabilities
$p(x_1), p(x_2), \ldots, p(x_n)$. The entropy of $X$ is given by
$H(X) = -\sum_j p(x_j) \log p(x_j)$ and is a measure of the uncertainty about the
value of $X$. The uncertainty about the value of $X$ when the value of another
variable $Y$ is known is given by
$H(X|Y) = -\sum_i p(y_i) \sum_j p(x_j|y_i) \log p(x_j|y_i)$.

The amount by which the uncertainty of $X$ is reduced when the value of variable $Y$
is known, $I(Y, X) = H(X) - H(X|Y)$, is called the mutual information between $Y$
and $X$. In this context, $Y$ will be a variable defined by the output of one or more
nodes in the network and $X$ will be the target value specified in the training set.
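
The definitions above can be made concrete with a short Python sketch that estimates the quantities from a list of observations. The function names are ours, logarithms are taken base 2, and the example uses the complete truth table of f = ab + cde, the function that appears in Figure 2.

```python
import math
from collections import Counter
from itertools import product

def entropy(values):
    """H(X) in bits, estimated from a list of observed values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def conditional_entropy(x, y):
    """H(X|Y) = sum_i p(y_i) * H(X | Y = y_i)."""
    n = len(x)
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(yi, []).append(xi)
    return sum(len(g) / n * entropy(g) for g in groups.values())

def mutual_information(y, x):
    """I(Y, X) = H(X) - H(X|Y)."""
    return entropy(x) - conditional_entropy(x, y)

# Complete truth table of f = ab + cde.
rows = list(product([0, 1], repeat=5))
target = [(a & b) | (c & d & e) for a, b, c, d, e in rows]
input_a = [r[0] for r in rows]
node_ab = [r[0] & r[1] for r in rows]

print(round(entropy(target), 2))                       # 0.93
print(round(mutual_information(input_a, target), 2))   # 0.16
print(round(mutual_information(node_ab, target), 2))   # 0.52
```

The node ab carries far more information about the target than the primary input a alone, which is why a greedy search prefers it.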


¹ Up to some specified level.


3  Algorithms 

3.1  Muesli  -  An  algorithm  for  the  design  of  multi-level  logic  networks 

This  algorithm  derives  the  Boolean  network  by  performing  gradient  descent  in  the 
mutual  information  between  a  set  of  nodes  and  the  target  values  specified  by  the 
labels  in  the  training  set. 

In  the  pseudo  code  description  of  the  algorithm  given  in  figure  1,  the  function  I(S) 
computes the mutual information between the nodes in S (viewed as a multi-valued
variable)  and  the  target  output. 


muesli(nlist) {
    nlist ← sort_nlist_by_I(nlist, 1);
    sup ← 2;
    while (not_done(nlist) ∧ sup < max_sup) {
        act ← 0;
        do {
            act++;
            success ← improve_mi(act, nlist, sup);
        } while (success = FALSE ∧ act < max_act);
        if (success = TRUE) {
            sup ← 2;
            while (success = TRUE)
                success ← improve_mi(act, nlist, sup);
        }
        else sup++;
    }
}

improve_mi(act, nlist, sup) {
    nlist ← sort_nlist_by_I(nlist, act);
    f ← best_function(nlist, act, sup);
    if (I(nlist[1:act-1] ∪ f) > I(nlist[1:act])) {
        nlist ← nlist ∪ f;
        return(TRUE);
    }
    else return(FALSE);
}


Figure 1: Pseudo-code for the Muesli algorithm


The algorithm works by keeping a list of candidate nodes, nlist, that initially contains
only the primary inputs. The act variable selects which node in nlist is active.
Initially, act is set to 1 and the node that provides the most information about the
output is selected as the active node. Function improve_mi() tries to combine the
active node with other nodes so as to increase the mutual information.

Except for very simple functions, a point will be reached where no further improvements
can be made for the single most informative node. The value of act is then
increased (up to a pre-specified maximum) and improve_mi is again called to select
auxiliary features using other nodes in nlist as the active node. If this fails, the
value of sup (size of the support of each selected function) is increased until no
further improvements are possible or the target is reached.

The function sort_nlist_by_I(nlist, act) sorts the first act nodes in the list by
decreasing value of the information they provide about the labels. More explicitly, the
first node in the sorted list is the one that provides maximal information about the
labels. The second node is the one that provides the most additional information
after the first has been selected, and so on.

Function improve_mi() calls best_function(nlist, act, sup) to select the Boolean
function f that takes as inputs node nlist[act] plus sup-1 other nodes and maximizes
I(nlist[1 : act-1] ∪ f). When sup is larger than 2 it is infeasible to search all $2^{2^{sup}}$
possible functions to select the desired one. However, given sup input variables,
finding such a function is equivalent to selecting a partition² of the $2^{sup}$ points in
the input space that maximizes a specific cost function. This partition is found using
the Kernighan-Lin algorithm [Kernighan and Lin, 1970] for graph-partitioning.
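
For sup = 2 the search is small enough to enumerate. The following Python sketch tries all 16 Boolean functions of the active node and one other node and keeps the one with the highest mutual information with the target. It is an illustration under stated assumptions rather than the procedure used in the experiments: it uses plain exhaustive search instead of the Kernighan-Lin partitioning described above, it scores the new node on its own rather than jointly with nlist[1 : act-1], and the name `best_two_input_function` is hypothetical.

```python
import math
from collections import Counter
from itertools import product

def mutual_information(y, x):
    """I(Y, X) in bits, computed as H(X) + H(Y) - H(X, Y)."""
    n = len(x)
    def h(counter):
        return -sum(c / n * math.log2(c / n) for c in counter.values())
    return h(Counter(x)) + h(Counter(y)) - h(Counter(zip(x, y)))

def best_two_input_function(active, others, target):
    """Try all 16 Boolean functions of (active, other) for every other node."""
    best = (-1.0, None, None)
    for idx, other in enumerate(others):
        for table in product([0, 1], repeat=4):      # truth table of g(u, v)
            node = [table[2 * u + v] for u, v in zip(active, other)]
            score = mutual_information(node, target)
            if score > best[0] + 1e-12:              # keep the first of any ties
                best = (score, idx, table)
    return best

# Complete truth table of f = ab + cde, with inputs a..e as columns.
rows = list(product([0, 1], repeat=5))
target = [(a & b) | (c & d & e) for a, b, c, d, e in rows]
columns = [[r[i] for r in rows] for i in range(5)]

score, partner, table = best_two_input_function(columns[0], columns[1:], target)
print(round(score, 2), partner, table)
```

With input a as the active node, the search pairs it with b (partner index 0) and returns a truth table equivalent to ab, with a mutual information of about 0.52.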


Figure  2  exemplifies  how  the  algorithm  works  when  learning  the  simple  Boolean 
function f = ab + cde from a complete training set. In this example, the value of
sup is always at 2. Therefore, only 2-input Boolean functions are generated.

[Figure 2 panels, in order. Initially nlist = [a,b,c,d,e], act = 1, mi([a]) = 0.16.
Select x = ab: act = 1, mi([x]) = 0.52. Fails to find f(x,?) with mi([f]) > 0.52;
set act = 2, after which the two leading nodes give mi = 0.63. Select y = cd:
act = 2, mi([x,y]) = 0.74. Select w = ye: act = 2, mi([x,w]) = 0.93. Fails to find
f(w,?) with mi([x,f]) > 0.93; set act = 0 and select z = x+w: act = 1, mi([z]) = 0.93.]

Figure  2:  The  muesli  algorithm,  illustrated 


² A single output Boolean function is equivalent to a partition of the input space into two
sets.


3.2  Fulfringe  -  a  network  generation  algorithm  based  on  decision  trees 


This algorithm uses binary decision trees [Quinlan, 1986] as the basic underlying
representation. A binary decision tree is a rooted, directed, acyclic graph, where
each terminal node (a node with no outgoing edges) is labeled with one of the
possible output labels and each non-terminal node has exactly two outgoing edges
labeled  0  and  1.  Each  non-terminal  node  is  also  labeled  with  the  name  of  the 
attribute  that  is  tested  at  that  node.  A  decision  tree  can  be  used  to  classify  a 
particular  example  by  starting  at  the  root  node  and  taking,  until  a  terminal  is 
reached,  the  edge  labeled  with  the  value  of  the  attribute  tested  at  the  current  node. 
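
The classification procedure just described can be written in a few lines of Python. The tuple encoding of the tree and the small example tree are assumptions made for this sketch.

```python
# A non-terminal node is (attribute_name, subtree_for_0, subtree_for_1);
# a terminal node is just its output label. This encoding is an assumption.
tree = ("a",
        ("c", 0, ("d", 0, 1)),   # a = 0: test c, then d
        ("b", 0, 1))             # a = 1: test b

def classify(node, example):
    """Follow the edge matching each tested attribute until a terminal."""
    while isinstance(node, tuple):
        attribute, if_zero, if_one = node
        node = if_one if example[attribute] else if_zero
    return node

print(classify(tree, {"a": 1, "b": 1, "c": 0, "d": 0}))   # 1
print(classify(tree, {"a": 0, "b": 1, "c": 1, "d": 0}))   # 0
```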


Decision  trees  are  usually  built  in  a  greedy  way.  At  each  step,  the  algorithm  greedily 
selects  the  attribute  to  be  tested  as  the  one  that  provides  maximal  information  about 
the label of the examples that reached that node in the decision tree. It then recurs
after  splitting  these  examples  according  to  the  value  of  the  tested  attribute. 
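
A minimal Python sketch of this greedy construction is given below. It is a generic information-gain tree builder written for illustration, not the implementation used in the experiments; the function names, the tuple encoding of the tree, and the toy target f = ab + c are assumptions.

```python
import math
from collections import Counter
from itertools import product

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(examples, labels, attr):
    """Reduction in label entropy obtained by testing attribute attr."""
    remainder = 0.0
    for value in (0, 1):
        subset = [l for e, l in zip(examples, labels) if e[attr] == value]
        if subset:
            remainder += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remainder

def build_tree(examples, labels, attrs):
    """Return a label (terminal) or (attr, subtree_0, subtree_1)."""
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: information_gain(examples, labels, a))
    rest = [a for a in attrs if a != best]
    branches = []
    for value in (0, 1):
        pairs = [(e, l) for e, l in zip(examples, labels) if e[best] == value]
        if not pairs:                      # no example reached this branch
            branches.append(Counter(labels).most_common(1)[0][0])
            continue
        sub_e, sub_l = zip(*pairs)
        branches.append(build_tree(list(sub_e), list(sub_l), rest))
    return (best, branches[0], branches[1])

def classify(node, example):
    while isinstance(node, tuple):
        node = node[2] if example[node[0]] else node[1]
    return node

# Complete training set for the toy target f = ab + c (attributes 0, 1, 2).
examples = list(product([0, 1], repeat=3))
labels = [(a & b) | c for a, b, c in examples]
tree = build_tree(examples, labels, [0, 1, 2])
print(tree)
```

On this toy problem the root tests attribute 2 (the input c), since it is the single most informative attribute about the label.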


Fulfringe works by identifying patterns near the fringes of the decision tree and
using them to build new features. The idea was first proposed in [Pagallo and
Haussler, 1990].


Figure 3: Fringe patterns identified by fulfringe

Figure 3 shows the patterns that fulfringe identifies. Dcfringe, proposed in [Yang
et al., 1991], identifies the patterns shown in the first two rows. These patterns
correspond to 8 Boolean functions of 2 variables. Since there are only 10 distinct
Boolean functions that depend on two variables³, it is natural to add the patterns
in the third row and identify all possible functions of 2 variables. As in dcfringe
and fringe, these new composite features are added (if they have not yet been
generated) to the list of available features and a new decision tree is built. The


³ The remaining 6 functions of 2 variables depend on only one or none of the variables.


process  is  iterated  until  a  decision  tree  with  only  one  decision  node  is  built.  The 
attribute  tested  at  this  node  is  a  complex  feature  and  can  be  viewed  as  the  output 
of  a  Boolean  network  that  matches  the  training  set  data. 
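
The count of 10 functions that depend on both variables can be verified by enumeration. The Python check below lists all 16 truth tables of two Boolean variables and keeps those whose output changes when either input is flipped; the helper name is ours.

```python
from itertools import product

def depends_on(table, var):
    """True if flipping input `var` changes the output for some input."""
    for u, v in product([0, 1], repeat=2):
        flipped = (1 - u, v) if var == 0 else (u, 1 - v)
        if table[(u, v)] != table[flipped]:
            return True
    return False

all_tables = [dict(zip(product([0, 1], repeat=2), outputs))
              for outputs in product([0, 1], repeat=4)]
both = [t for t in all_tables if depends_on(t, 0) and depends_on(t, 1)]
print(len(all_tables), len(both))   # 16 10
```

The 6 excluded functions are the two constants, the two inputs themselves, and their complements.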

3.3  Encoding  multivalued  outputs 

Both muesli and fulfringe generate Boolean networks with a single binary valued
output. When the target label can have more than 2 values, some encoding must be
used. The preferred solution is to encode the outputs using an error correcting code
[Dietterich and Bakiri, 1991]. This approach preserves most of the compactness of
a digital encoding while being much less sensitive to errors in one of the output
variables. Additionally, the Hamming distance between an observed output and the
closest valid codeword gives a measure of the certainty of the classification. This
can be used to our advantage in problems where a failure to classify is less serious
than the output of a wrong classification.
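
Decoding with an error correcting output code, including the option to reject uncertain outputs, might look like the following Python sketch. The 7-bit codebook is a made-up example with pairwise Hamming distance 4, and the function names and the rejection threshold are assumptions rather than details taken from this paper.

```python
def hamming(a, b):
    """Number of positions in which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def decode(output, codebook, max_distance):
    """Return (label, distance); label is None if the nearest word is too far."""
    label, word = min(codebook.items(), key=lambda kv: hamming(output, kv[1]))
    distance = hamming(output, word)
    return (label if distance <= max_distance else None), distance

# Illustrative codewords with pairwise Hamming distance 4 (not from the paper).
codebook = {
    "A": (0, 0, 0, 0, 0, 0, 0),
    "B": (1, 1, 1, 1, 0, 0, 0),
    "C": (1, 1, 0, 0, 1, 1, 0),
    "D": (0, 0, 1, 1, 1, 1, 0),
}

print(decode((1, 1, 1, 0, 0, 0, 0), codebook, 1))   # ('B', 1)
print(decode((1, 1, 0, 0, 0, 0, 0), codebook, 1))   # (None, 2)
```

The first output is one bit away from the codeword for B and is accepted; the second is two bits away from three different codewords and is rejected rather than misclassified.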

4  Performance  evaluation 

To evaluate the algorithms, we selected a set of 11 functions of variable complexity.
A complete description of these functions can be found in [Oliveira, 1994]. The first
6 functions were proposed as test cases in [Pagallo and Haussler, 1990] and accept
compact disjunctive normal form descriptions. The remaining ones accept compact
multi-level representations but have large two level descriptions. The algorithms
described in sections 3.1 and 3.2 were compared with the cascade-correlation
algorithm [Fahlman and Lebiere, 1990] and a standard decision tree algorithm analogous
to ID3 [Quinlan, 1986]. As in [Pagallo and Haussler, 1990], the number of examples
in the training set was selected to be equal to $1/\epsilon$ times the description length of the
function under a fixed encoding scheme, where $\epsilon$ was set equal to 0.1. For each
function,  5  training  sets  were  randomly  selected.  The  average  accuracy  for  the  5 
runs  in  an  independent  set  of  4000  examples  is  listed  in  table  1. 


Table 1: Accuracy of the four algorithms.

[Table body only partly recoverable. The legible header cells are "Accuracy" and
"muesli". Each row lists its surviving entries in their original order; [?] marks an
unreadable entry, and some rows are missing entries.]

dnf1      80     3292    99.91    99.98    75.38
[?]       40     2185    99.28    98.89    88.84    73.11
[?]       [?]    1650    99.94    100.00   79.19
[?]       [?]    2640    100.00   [?]      72.61    [?]
[?]       [?]    1200    [?]      [?]      [?]      99.91
xor5_32   [?]    4000    60.16    100.00   51.41    99.97
[?]       12     1540    99.81    [?]
[?]       18     2720    100.00   [?]      91.48    91.30
[?]       18     2720    100.00   100.00   94.55    92.57
str27     27     4160    98.64    99.35    94.24
carry8    16     2017    98.71    99.22
[?]       99.71  85.35   87.45


The results show that the performance of muesli and fulfringe is consistently
superior to the other two algorithms. Muesli performs poorly in examples that have
many xor functions, due to the greedy nature of the algorithm. In particular, muesli
failed to find a solution in the allotted time for 4 of the 5 runs of xor5_32 and found
the exact solution in only one of the runs.

ID3 was the fastest of the algorithms and Cascade-Correlation the slowest. Fulfringe
and muesli exhibited similar running times for these tasks. We observed, however,
that for larger problems the runtime for fulfringe becomes prohibitively high and
muesli is comparatively much faster.

5  Applications 

To evaluate the techniques described on real problems, experiments were performed
in two domains: noisy image reconstruction and handwritten character recognition.
The main objective was to investigate whether the approach is applicable to problems
that are not known to accept a compact Boolean network representation. The
outputs were encoded using a 15-bit Hadamard error correcting code.
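
A minimal sketch of Hadamard output coding follows. It assumes a Sylvester-type construction with the constant first column dropped, which gives 16 codewords of 15 bits with pairwise Hamming distance 8. The construction and the helper names (hadamard_codewords, hamming, decode) are illustrative assumptions, not the authors' implementation.

```python
def hadamard_codewords(k):
    """Return 2**k binary codewords of length 2**k - 1 (assumed Sylvester construction)."""
    rows = [[0]]
    for _ in range(k):
        # Doubling step: [H H] on top, [H ~H] below, in 0/1 form.
        rows = [r + r for r in rows] + [r + [1 - b for b in r] for r in rows]
    # The first column is 0 in every row, so it carries no information: drop it.
    return [r[1:] for r in rows]


def hamming(a, b):
    """Number of positions in which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(word, codewords):
    """Index of the codeword closest to `word` in Hamming distance."""
    return min(range(len(codewords)), key=lambda i: hamming(word, codewords[i]))


codes = hadamard_codewords(4)      # 16 codewords, 15 bits each
noisy = list(codes[5])
for pos in (0, 7, 14):             # three network outputs are wrong
    noisy[pos] ^= 1
recovered = decode(noisy, codes)   # nearest-codeword decoding
```

Because any two codewords differ in 8 positions, nearest-codeword decoding tolerates up to 3 wrong output bits, which is what makes such a code attractive when each output bit is learned separately.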

5.1  Image  reconstruction 

The speed required by applications in image processing makes it a very interesting
field for this type of approach. In this experiment, 16-level gray scale images were
corrupted by random noise by switching each bit with 5% probability. Samples of
the corrupted image were used to train a network in the reconstruction of the original
image. The training set consisted of 5x5 pixel regions of corrupted images (100 binary
variables per sample) labeled with the value of the center pixel. Figure 4 shows a
detail of the reconstruction performed on an independent test image by the network
obtained using fulfringe.
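
The data preparation described above can be sketched as follows. This is only an illustration: it assumes 4 bits per pixel (16 gray levels), images stored as lists of rows, and labels taken from the center pixel of the uncorrupted image. The function names are hypothetical.

```python
import random

BITS = 4  # 16 gray levels -> 4 bits per pixel (assumption)


def corrupt(image, p=0.05, rng=random):
    """Flip each bit of every pixel independently with probability p."""
    noisy = []
    for row in image:
        new_row = []
        for pixel in row:
            for b in range(BITS):
                if rng.random() < p:
                    pixel ^= 1 << b
            new_row.append(pixel)
        noisy.append(new_row)
    return noisy


def training_samples(clean, noisy, size=5):
    """Build (inputs, label) pairs: size x size noisy windows, clean center pixel."""
    half = size // 2
    samples = []
    for r in range(half, len(clean) - half):
        for c in range(half, len(clean[0]) - half):
            bits = []
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    value = noisy[r + dr][c + dc]
                    # Most significant bit first: 4 binary variables per pixel.
                    bits.extend((value >> b) & 1 for b in reversed(range(BITS)))
            samples.append((bits, clean[r][c]))
    return samples


clean = [[(r + c) % 16 for c in range(8)] for r in range(8)]
noisy = corrupt(clean, rng=random.Random(0))
samples = training_samples(clean, noisy)  # each input has 5*5*4 = 100 bits
```

Each 5x5 window contributes 25 pixels of 4 bits, which matches the 100 binary variables per sample quoted in the text.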


Figure 4: Image reconstruction experiment


5.2  Handwritten  character  recognition 

The NIST database of handwritten characters was used for this task. Individually
segmented digits were normalized to a 16 by 16 binary grid. A set of 53629 digits
was used for training and the resulting network was tested on a different set of 52467
digits. Training was performed using muesli. The algorithm was stopped after a
prespecified time (48 hours on a DECstation 5000/260) elapsed. The resulting network
was placed and routed using the TimberWolf (Sechen and Sangiovanni-Vincentelli,
1986) package and occupies an area of 78.8 sq. mm. using 0.8 μm technology.

The accuracy on the test set was 93.9%. This value compares well with the
performance obtained by alternative approaches that use a similarly sized training set
and  little  domain  knowledge,  but  falls  short  of  the  best  results  published  so  far. 
Ongoing  research  on  this  problem  is  concentrated  on  the  use  of  domain  knowledge 
to  restrict  the  search  for  compact  networks  and  speed  up  the  training. 
Abstract

Speech  recognizers  often  operate  in  an  adverse  environment:  acoustic  ambient  noise,  spectral  distortions, 
reverberation  and  other  environmental  factors  all  cause  severe  degradation  in  the  recognition  performance. 
In  this  report,  we  discuss  a  front-end  technique,  called  Jah-RASTA,  to  combat  noise  and  improve  the 
robustness  of  speech  recognizers.  Jah-RASTA  processing  uses  bandpass  filtering  of  temporal  trajectories 
of  non-linearly  transformed  critical  band  spectrum  to  simultaneously  reduce  additive  noise  and  spectral 
distortion.  However,  the  optimal  form  of  the  nonlinear  transform  used  by  Jah-RASTA  is  a  function  of  the 
noise level, a time-varying quantity. This introduces a new source of variability into the speech recognition
system, and hurts recognition performance. To compensate for this new source of variability, a spectral
mapping approach has been developed. The method shows improved robustness and is computationally
efficient as well.
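
The front-end idea can be sketched for a single critical-band trajectory as below. The abstract does not give the transform or the filter, so two forms from the RASTA literature are assumed here: the compression y = ln(1 + J x) and a band-pass filter with numerator (0.2, 0.1, 0, -0.1, -0.2) and a single pole at 0.98. The function name and default values are illustrative.

```python
import math


def j_rasta_trajectory(energies, J=1e-6, pole=0.98):
    """Compress one critical-band energy trajectory, then band-pass filter it in time."""
    # Assumed compression: roughly linear when J*x << 1, logarithmic when J*x >> 1.
    compressed = [math.log(1.0 + J * e) for e in energies]
    # Assumed filter numerator; the taps sum to zero, so constant offsets are removed.
    numer = [0.2, 0.1, 0.0, -0.1, -0.2]
    out, prev = [], 0.0
    for n in range(len(compressed)):
        acc = sum(b * compressed[n - k] for k, b in enumerate(numer) if n - k >= 0)
        prev = acc + pole * prev   # single-pole recursive part
        out.append(prev)
    return out


# A constant (stationary) band energy is filtered out after the initial transient.
steady = j_rasta_trajectory([1.0e6] * 1000)
```

The dependence on J is the point made in the abstract: the best value of J changes with the noise level, so features computed with different J values are not directly comparable without some form of compensation.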


"a  Neural-Net  Based,  In-Line  Focus/Exposure  Monitor,"  submitted  for  the 
M.S.  Degree,  by  Pamela  Tsai 

Abstract 

The  calibration  of  defocus  distance  and  exposure  time  in  lithographic  equipment  for  integrated 
circuits  fabrication  is  currently  performed  manually.  An  automated  approach  promises  better 
consistency  and  reproducibility  at  a  lower  cost.  The  two  critical  parameters  that  determine  the 
performance of a lithographic stepper are the defocus distance and the exposure time. Currently,
the optimal settings are selected after examining a pattern that has been projected several times
across  one  wafer.  Each  projection  is  done  under  a  different  combination  of  exposure  time  and 
defocus.  The  “best”  pattern  is  chosen  by  an  experienced  operator,  who  looks  for  the  image  that 
appears  to  be  the  sharpest,  having  the  most  vertical  sidewalls,  and  whose  critical  dimensions  are 
the  closest  to  those  of  the  desired  pattern.  The  focus  and  exposure  settings  corresponding  to  this 
image  are  then  selected  as  the  settings  to  use.  This,  for  example,  is  done  when  choosing  the  best 
exposure and identifying the current focus using a SMILE or Bossung plot. This calibration
procedure  has  to  be  repeated  periodically  since  the  stepper,  the  light  source  and  the  chemicals 
tend  to  age.  Calibration  is  also  necessary  whenever  maintenance  is  performed,  or  whenever  the 
machine  is  configured  for  the  patterning  of  a  new  layer. 

In  this  project  we  applied  a  two  dimensional  pattern  recognition  network  which  was  trained  to 
choose  the  “best”  developed  image.  We  collected  a  database  of  digitized  optical  calibration 
images  generated  on  our  stepper  and  tagged  with  a  qualification  code  supplied  by  a  human 
expert.  A  feed  forward  network  was  trained  using  the  backpropagation  training  algorithm  to 
recognize  key  aspects  of  the  patterns  exposed  under  different  stepper  settings.  We  used  image 
processing  techniques  (such  as  edge  extraction  and  convolution)  to  pre-process  the  data  before  it 
Abstract

Vertical-cavity  Laser  Diodes  Fabricated  by 
Phase-locked Epitaxy
by 

Jeffrey  David  Walker 

Doctor  of  Philosophy  in  Electrical  Engineering  and  Computer  Sciences 
University  of  California  at  Berkeley 
Professor  John  Stephen  Smith,  Chair 

The fabrication of vertical-cavity surface-emitting laser diodes (VCSELs) has
challenged  the  capabilities  of  conventional  thin  film  growth  techniques  such  as 
MBE  because  of  the  stringent  requirements  on  layer  thickness  and  interface  flatness 
required  to  produce  high  reflectivity  AlGaAs  Bragg  reflectors.  This  dissertation 
presents  a  new  growth  technique  for  the  fabrication  of  VCSELs  that  is  based  on 
phase-locked  epitaxy  (PLE)  and  that  addresses  the  problems  associated  with  the 
growth  of  these  structures.  The  techniques  presented  here  are  the  first  to  extend 
PLE  toward  the  fabrication  of  precision  macroscopic  structures  such  as  VCSELs. 

Experimental and theoretical analyses show that PLE-grown Bragg reflectors
have  a  maximum  error  in  layer  periodicity  of  1%,  and  a  maximum  loss  due  to 
optical  scattering  of  0.01%  per  interface.  In  addition,  1.5%  uniformity  in  layer 
thickness  across  a  50  mm  wafer  is  achievable. 

Because  the  PLE  growth  technique  solves  the  thin-film  growth  problems  that 
have  hampered  VCSEL  development,  it  has  been  possible  to  fabricate  extremely 
high  quality  lasers.  The  lasers  presented  here  were  the  first  to  be  fabricated  using 
an  in  situ  feedback  technique  to  control  layer  thickness.  They  were  also  the  first 
VCSELs to lase in the 10 mW CW power range, the first to utilize lower aluminum
content  mirrors  to  control  series  resistance,  and  the  first  with  low  threshold  voltages 
(1.6 V). Furthermore, this growth technique is capable of growing high quality
VCSEL wafers within tight tolerances and with near 100% yields on both
single-wafer and wafer-to-wafer scales.

In  addition  to  results  from  VCSELs  fabricated  by  PLE,  a  new  type  of  integrable 
device called a deformable Fabry-Perot (DFP) cavity is proposed. The DFP has
potential applications for broadly-tunable surface-normal detectors and lasers for
wavelength  division  multiplexing.  Theoretical  analysis  indicates  50  nm  wavelength 
tunability  with  a  10  V  applied  bias. 

Finally, a new type of AlGaAs multiple quantum well 5-20 μm mid-infrared
detector  is  proposed.  The  device  uses  modification  of  the  conduction  band  wave 
functions  to  allow  for  the  absorption  of  normal  incidence  light.  Theoretical  analysis 
shows the ability to achieve 1000 cm^-1 absorption.


Professor  John  Stephen  Smith 
Committee  Chairman 
Hot-Carrier Currents of SOI MOSFETs

Hsing-jen Wann, Joe King, Jian Chen, Ping K. Ko, and Chenming Hu
Department of Electrical Engineering and Computer Sciences
University of California at Berkeley, CA 94720

MOSFETs built on the SOI structure exhibit superior short-channel behavior over bulk MOSFETs [1].
They also have other advantages such as reduced junction capacitance, radiation hardness, and ease
of device isolation. The SOI MOSFET is a promising candidate for future device scaling. The hot-carrier
effect, which increases with device miniaturization, is another important device scaling constraint that has to
be considered for the SOI MOSFET. The hot-carrier effect, which in bulk devices is usually monitored by the
substrate current for the NMOSFET and the gate current for the PMOSFET, is closely related to
the high channel electric field near the drain [2]. The quasi-two-dimensional (quasi-2D) model provides the
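link between the hot-carrier currents and the device design parameters for the bulk MOSFETs [3]. In this
model the maximum channel field is

$$E_m = \frac{V_D - V_{DSAT}}{\ell}, \qquad \ell = 0.22\, t_{ox}^{1/3} X_j^{1/2} \ \text{[cm]},$$

where $\ell$ is the characteristic length. The hot-carrier currents are given by:

$$I_{SUB} = \frac{A_i}{B_i}\,(V_D - V_{DSAT})\, I_D \exp\!\left(-\frac{\ell B_i}{V_D - V_{DSAT}}\right) \qquad (1)$$

for the substrate current in the NMOSFET [2], and

$$I_G = 0.5\, I_{SUB}\, \frac{t_{ox}}{\lambda_r} \left(\frac{\lambda E_m}{\Phi_b}\right)^{2} P(E_{ox}) \exp\!\left(-\frac{\Phi_b}{\lambda E_m}\right) \qquad (2)$$

for the gate current in the PMOSFET [4].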
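
As a rough numerical illustration of the bulk quasi-2D model, the sketch below evaluates the characteristic length, the maximum channel field, and the substrate-to-drain current ratio in the commonly quoted form (A_i/B_i)(V_D - V_DSAT) exp(-B_i/E_m). The impact-ionization constants A_i and B_i, the bias values, and the function names are assumptions chosen for illustration only; they are not taken from this work.

```python
import math


def characteristic_length_cm(t_ox_cm, x_j_cm):
    """Bulk quasi-2D characteristic length, 0.22 * tox^(1/3) * Xj^(1/2), lengths in cm."""
    return 0.22 * t_ox_cm ** (1.0 / 3.0) * math.sqrt(x_j_cm)


def max_channel_field(v_d, v_dsat, length_cm):
    """Maximum channel field (V/cm) near the drain."""
    return (v_d - v_dsat) / length_cm


def substrate_to_drain_ratio(v_d, v_dsat, length_cm, a_i=2.0e6, b_i=1.7e6):
    """I_SUB / I_D; a_i (1/cm) and b_i (V/cm) are assumed illustrative constants."""
    e_m = max_channel_field(v_d, v_dsat, length_cm)
    return (a_i / b_i) * (v_d - v_dsat) * math.exp(-b_i / e_m)


# Example: tox = 70 Angstrom (7e-7 cm) and an assumed Xj = 0.1 um (1e-5 cm).
length = characteristic_length_cm(7e-7, 1e-5)
field = max_channel_field(3.0, 1.0, length)
ratio = substrate_to_drain_ratio(3.0, 1.0, length)
```

Substituting the silicon film thickness for Xj in characteristic_length_cm shows directly how strongly the bulk expression ties the predicted field to the film thickness.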


There have been many attempts to apply the quasi-2D model to study the hot-carrier effect of SOI
MOSFETs [5-7]. However, the substrate current cannot be measured directly, and the definition of the
junction depth Xj is missing in thin-film SOI MOSFETs. In some works tsi, the silicon film thickness, has been
used for Xj. The validity of this substitution was not justified.

Fig.1 shows the simulated maximum channel electric field compared to what the bulk model predicts with
Xj replaced by tsi. Such a substitution results in too strong a dependence of the channel field on tsi.
Fig.2 shows that the lateral doping gradient plays an important role in determining the high channel field.
Note that in the bulk MOSFET, the Xj dependence of the channel field comes from two effects, the 2D
effect and the lateral junction gradient effect. These two effects are strongly correlated because Xj and the
lateral diffusion are formed during the same processing steps. However, in the SOI MOSFET such a correlation
is missing. We need to refine the model by considering the 2D effect and the lateral doping gradient effect
separately.  This  is  done  by  solving  the  Poisson  equation  in  the  velocity  saturation  region  with  the  lateral 
drain  doping  profile  approximated  by  the  exponential  function  around  the  concentration  of  2x  10“[8]: 


The maximum channel field is found to be

$$E_m = \frac{V_D - V_{DSAT}}{\ell} \cdot FRF,$$

with FRF shown in Fig.3, where $\ell$ is now the device characteristic length for the MOSFET with abrupt
junctions. For the MOSFET with thinner tsi, the channel field is larger due to the stronger 2D effect. The field
penetrates into the drain junction region with higher concentration and is reduced more. Therefore the
sensitivity to tsi is weakened.

Fig.4 and 5 show the experimental data of hot-carrier body currents for SOI NMOSFETs. These devices
have special layouts for the body contacts to collect the hot-carrier currents generated by impact ionization.
Good agreement between the data and the model (2) is found using the parameters in the captions. Fig.6
shows the hot-electron current data for SOI PMOSFETs. The ratio Ibody/Id does not increase much with
thinner tox and tsi. Since the ratio Igate/Ibody depends on Eox, the device with thinner tox would have larger
gate currents.
Fig.3 The "field reduction factor," defined as the reduction of
Em due to the finite drain junction gradient as opposed to an
abrupt junction, is a function of λ/ℓ.


Fig.5 NMOSFET body currents for several channel lengths
compared with (2). The parameters used in the model are:
tox=90Å, tsi=760Å (both measured), and λ=130Å.


Fig.2  The  channel  field  rises  exponentially  in  the  velocity 
saturation  region  as  predicted  by  the  quasi-2D  model.  The 
lateral  doping  gradient  is  important  in  determining 
Fig.4 NMOSFET body currents for several Ve’s compared
with (2). The parameters used in the model are: tox=70Å,
tsi=560Å (both measured), and λ=130Å.


Fig.6  The  measured  PMOSFET  body  currents  and  gate 
currents. 